ELECTRONIC MESSAGE FILTER HAVING A WHITELIST
DATABASE AND A QUARANTINING MECHANISM

This application is a continuation-in-part of U.S. Serial No. 09/548,322, filed
in April 2000, which is a continuation-in-part of U.S. Serial No. 09/447,590, filed
November 23, 1999, which are incorporated herein by reference.

BACKGROUND OF THE INVENTION 
Field of the Invention

This invention generally concerns electronic messaging. In particular, the 
present invention concerns a system for filtering undesired electronic mail.

Description of the Related Art

Generally, the term "spam" has come to refer to posting electronic messages to
news groups or mailing to addresses on an address list the same message an
unacceptably large number (generally, 20-25) of times. As used herein, the term
"spam" or "junk mail" refers to the sending of unsolicited electronic messages (or
"email") to a large number of users on the Internet. This includes email
advertisements, sometimes referred to as Unsolicited Commercial Email (UCE), as
well as non-commercial bulk email that advocates some political or social position. A
"spammer" is a person or organization that generates the junk mail.

The principal objection to junk mail is that it is theft of an organization's
resources, such as time spent by employees to open each message, classify it
(legitimate vs. junk), and delete the message. Time is also spent by employees
following up on advertising content while on the job. In addition, there is an
increased security risk from visiting web sites advertised in email messages.
Employees may also be deceived into acting improperly, such as to release
confidential information, due to a forged message. Still yet, there is a loss of the
network administrator's time to deal with spam and forged messages, as well as the
use of network bandwidth, disk space, and system memory required to store the
message. Finally, in the process of deleting junk mail, users may inadvertently
discard or overlook other important messages. Another objection to junk mail is that
it is frequently used to advertise objectionable, fraudulent, or dangerous content, such
as pornography or illegal pyramid schemes, or to propagate financial scams.

Spam can also be a serious security problem. For instance, the recent Melissa
virus and ExploreZip.worm have been spread almost exclusively via email
attachments. Such viruses are usually dangerous only if the user opens the attachment
that contains the malicious code, but many users open such attachments.

Email may also be used to download or activate dangerous code, such as Java
applets, JavaScript, and ActiveX controls. Email programs that support Hypertext
Markup Language (HTML) can download malicious Java applets or scripts that
execute with the mail user's privileges and permissions. Email has also been used to
activate certain powerful ActiveX controls that were distributed with certain operating
systems and browsers. In this case, the code is already on the user's system, but is
invoked in a way that is dangerous. For instance, this existing code can be invoked by
an email message to install a computer virus, turn off security checking, or to read,
modify, or delete any information on the user's disk drive.

Both spammers, and those who produce malicious code, typically attempt to
hide their identities when they distribute mail or code. Instead of mailing directly
from an easily-traced account at a major Internet provider, they may, for instance,
send their mail from a spam-friendly network, using forged headers, and relay the
message through intermediate hosts. Consequently, the same mechanisms that can be
used to block spam can also be used to provide a layer of protection for keeping
malicious code out of an organization's internal network.

Simple Mail Transfer Protocol (SMTP)

Simple Mail Transfer Protocol (SMTP) is the predominant email protocol
used on the Internet. As described in Request for Comments (RFC) 821, SMTP
provides for the transfer of electronic mail from a sending SMTP agent to a receiving
SMTP agent. SMTP is most commonly used with the Transmission Control
Protocol/Internet Protocol (TCP/IP) to transfer email between Internet hosts known as
Message Transfer Agents (MTAs). As shown in Figure 1, Internet mail operates at
two distinct levels: the User Agent (UA) and the MTA. User Agent programs provide
a human interface to the mail system and are concerned with sending, reading, editing,
and saving email messages. Message Transfer Agents handle the details of sending
email across the Internet.

According to SMTP, an email message is typically sent in the following
manner. A user 1040 (located at a personal computer or a terminal device) runs a UA
program 1041 to create an email message. When the User Agent completes processing
of the message, it places the message text and control information in a queue 1042 of
outgoing messages. This queue is typically implemented as a collection of files
accessible to the MTA. In some instances, the message may be created on a personal
computer and transferred to the queue using methods such as the Post Office Protocol
(POP) or Interactive Mail Access Protocol (IMAP).

The sending network will have one or more hosts that run a MTA 1043, such
as Unix sendmail by Sendmail, Inc. of California or Microsoft Exchange. By
convention, the MTA establishes a Transmission Control Protocol (TCP) connection to the
reserved SMTP port (TCP 25) on the destination host and uses the Simple Mail
Transfer Protocol (SMTP) 1044 to transfer the message across the Internet.

The SMTP session between the sending and receiving MTAs results in the
message being transferred from a queue 1042 on the sending host to a queue 1046 on
the receiving host. When the message transfer is completed, the receiving MTA 1045
closes the TCP connection used by SMTP, the sending host 1043 removes the
message from its mail queue, and the recipient 1048 can use his/her configured User
Agent program 1047 to read the message in the mail queue 1046.

Figure 2 is a graphical representation of an example of the SMTP messages
sent across the Internet. In this example, sender@remote.dom sends a message to
user@escom.com (The top-level domain name "dom" does not actually exist, and is
used for illustrative purposes only to avoid referring to an actual domain).

The sending host's Message Transfer Agent 1001 sends an email message to
the receiving host 1002. At step 1010, the sending MTA opens a TCP connection to
the receiving host's reserved SMTP port. This is shown as a dashed line with an
italics description, to differentiate it from the subsequent protocol messages. This
typically involves making calls to the Domain Name System (DNS) to get the IP
address of the destination host or the IP address from a Mail Exchange (MX) record
for the domain. For example, the domain escom.com has a single MX record that lists
the IP address 192.135.140.3. Other domains, particularly large Internet Service
Providers (ISPs), might have multiple MX records that define a prioritized list of IP
addresses to be used to send email to that domain.
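
By way of illustration only, the prioritized list defined by multiple MX records can be sketched as follows. The function name and the sample host names are hypothetical; the rule that a lower preference value is tried first comes from the DNS mail routing conventions rather than from this description, and the DNS lookup itself is omitted.

```python
def delivery_order(mx_records):
    """Order mail exchangers for delivery attempts.

    mx_records is an iterable of (preference, host) pairs, as might be
    obtained from a DNS MX query. Lower preference values are tried first;
    sorted() is stable, so ties keep their original relative order.
    """
    return [host for _, host in sorted(mx_records, key=lambda rec: rec[0])]


# Hypothetical records for a large provider with several mail exchangers.
records = [(20, "backup.isp.dom"), (10, "mail1.isp.dom"), (10, "mail2.isp.dom")]
order = delivery_order(records)
```

A domain with a single MX record, such as the escom.com example, simply yields a one-element list.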

The sending MTA typically establishes the connection by: (1) making a socket
system call to acquire a socket (a structure used to manage network communications);
(2) filling in the socket structure with the destination IP address (e.g., 192.135.140.3);
(3) defining the protocol family (Internet) and destination port number (by convention,
the MTAs use the reserved TCP port 25); and, (4) making a connect system call to
open a TCP connection to the remote MTA and returning a descriptor for the
communications channel.
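
The four steps above can be sketched with the Python standard library socket module. This is a minimal sketch, not the sendmail implementation: the function name and the timeout value are assumptions, and Python combines steps (2) and (3) into the arguments of the socket and connect calls.

```python
import socket

SMTP_PORT = 25  # reserved TCP port used by MTAs, per the description


def open_mta_connection(dest_ip, port=SMTP_PORT, timeout=10.0):
    """Open a TCP connection to a remote MTA and return the socket object."""
    # (1) acquire a socket; AF_INET selects the Internet protocol family
    #     and SOCK_STREAM selects TCP (part of step 3).
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.settimeout(timeout)  # assumed value; avoids blocking indefinitely
    try:
        # (2), (3), (4): supply the destination IP address and port number,
        # then make the connect call.
        sock.connect((dest_ip, port))
    except OSError:
        sock.close()
        raise
    # The returned object plays the role of the descriptor for the channel.
    return sock
```

A caller would pass the address obtained from the MX lookup, for example `open_mta_connection("192.135.140.3")`, and must close the returned socket when the session ends.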

The process of opening a TCP connection causes the receiving host's
operating system (or networking software) to associate the TCP connection with a
process that is listening on the destination TCP port. The TCP connection is a
bi-directional pipe between the sending MTA 1001 on the sending host and the
receiving MTA 1002 on the receiving host. SMTP is line-oriented, which means that
all protocol messages, responses, and message data are transferred as a sequence of
ASCII characters ending with a line feed (newline) character.

In step 1011, the receiving MTA sends a service greeting message when it is
ready to proceed. The greeting message typically gives the host name, MTA program
and version number, date/time/timezone, and perhaps additional information as
deemed appropriate by the host administrator. The greeting lines begin with the
three-character numeric code "220". By convention, the last/only line begins with the
four-character sequence "220 " and any preceding lines begin with "220-".
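
The continuation convention just described (a hyphen after the code on every line but the last, a space on the last line) can be checked mechanically. The following is a sketch under stated assumptions: the function name is hypothetical, the reply is assumed to have already been split into lines, and a final line consisting of the bare code is tolerated.

```python
def parse_reply(lines):
    """Parse one SMTP reply given as a list of text lines.

    Returns (code, texts) where code is the three-digit reply code as an int
    and texts holds the text following the separator on each line.
    Raises ValueError if the lines do not follow the convention.
    """
    if not lines:
        raise ValueError("empty reply")
    cleaned = [line.rstrip("\r\n") for line in lines]
    code = cleaned[0][:3]
    if len(code) != 3 or not code.isdigit():
        raise ValueError("reply does not start with a three-digit code")
    texts = []
    for index, line in enumerate(cleaned):
        is_last = index == len(cleaned) - 1
        if line[:3] != code:
            raise ValueError("reply code changes within one reply")
        separator = line[3:4]
        if is_last:
            if separator not in (" ", ""):
                raise ValueError("last line must use a space after the code")
        elif separator != "-":
            raise ValueError("continuation lines must use '-' after the code")
        texts.append(line[4:])
    return int(code), texts


greeting = ["220-escom.com ESMTP ready\r\n", "220 no junk mail please\r\n"]
code, texts = parse_reply(greeting)
```

The same function applies to the multi-line reply that a server generates in response to an extended greeting from the client.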

When the greeting message is received, the sending MTA may optionally send
a HELO message, step 1012, that lists its host name. Some mail servers require the
sending host to issue the message, and others do not. If the client (sending) MTA
issues the HELO message, then the server (receiving MTA) issues a HELO response,
step 1013, that lists its name. For Extended SMTP (ESMTP), the sending host sends
an EHLO message that performs essentially the same function as the HELO message.
In this case, the receiving host generates a multi-line reply listing the extended SMTP
commands that it supports.

At step 1014, the sending MTA sends a MAIL From: message to identify the
email address of the sender of the message, e.g., sender@remote.dom. By
convention, the Internet address is formed by concatenating the sending user's account
name, the "@" sign, and the domain name of the sending host. The resulting address
is typically enclosed in angle-brackets; however, this is not usually required by the
receiving mail server. It is noted that spammers can easily forge the MAIL address.

At step 1015, the receiving mail server sends either a "250" response if it
accepts the MAIL message or some other value, such as "550", if the message is not
accepted. The receiving mail server may reject the address for syntactical reasons
(e.g., no "@" sign) or because of the identity of the sender.

At step 1016, the sending MTA sends a RCPT To: message to identify the
address of an intended recipient of the message, e.g., user@escom.com. Again, this is
a standard Internet address, enclosed in angle-brackets. At step 1017, the receiving
server replies with a "250" status message if it accepts the address, and some other
value if the message is not accepted. For example, sendmail 8.9.3 issues a 550
message if the specified recipient address is not listed in the password file or alias list.
The sending MTA may send multiple RCPT messages (step 1016), usually one for
each recipient at the destination domain. The receiving server issues a separate "250"
or "550" response, as shown in step 1017, for each recipient.

At step 1018, the sending mail server sends a DATA message when it has
identified all of the recipients. The receiving server sends a response (nominally, "354",
as shown in step 1019) telling the sending server to begin sending the message one line
at a time, followed by a single period when the message is complete.

When the sending MTA receives this reply, it sends the text of the email
message one line at a time as shown in step 1020. Note that it does not wait for a
response after each line during this phase of the protocol. The message includes the
SMTP message header, the body of the message, and any attachments (perhaps
encoded) if supported by the sending User Agent program.

When the message transfer has been completed, the sending MTA writes a
single period (".") on a line by itself (step 1021) to inform the destination server of the
end of the message. The receiving MTA typically responds (step 1022) with a "250"
message if the message was received and saved to disk without errors. The sending
MTA then sends a "quit" (step 1023) and the receiving MTA responds with a "221"
message as shown in step 1024 and closes the connection.
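
For illustration, the lines written by the sending MTA during the exchange described above can be assembled as follows. This is a simplified sketch: the function name is hypothetical, the server responses (220, 250, 354, 221) that a real client waits for between commands are omitted, and the doubling of a leading period on message lines follows the transparency rule of RFC 821 rather than anything stated in this description.

```python
def client_commands(helo_host, sender, recipients, message_lines):
    """Return, in order, the lines a sending MTA writes for one message."""
    commands = ["HELO " + helo_host, "MAIL From:<" + sender + ">"]
    # One RCPT message per recipient at the destination domain.
    commands += ["RCPT To:<" + rcpt + ">" for rcpt in recipients]
    commands.append("DATA")
    for line in message_lines:
        # RFC 821 transparency: a message line that begins with a period gets
        # an extra period so it is not mistaken for the end-of-message marker.
        commands.append("." + line if line.startswith(".") else line)
    commands.append(".")   # single period on a line by itself ends the message
    commands.append("QUIT")
    return commands


session = client_commands(
    "remote.dom",
    "sender@remote.dom",
    ["user@escom.com"],
    ["Subject: test", "", "A single line of text.", ".signature"],
)
```

Each returned line would be terminated with the end-of-line sequence when written to the TCP connection.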

Figure 3 shows the same information, using a text representation of the SMTP
messages between the sending MTA (remote.dom) and receiving MTA (escom.com).
The first character of each line indicates the direction of the protocol message. The
">" character indicates a protocol message sent by the sending
MTA, and "<" indicates a message sent by the receiving MTA. These
characters do not form a part of the message being transmitted.

The email message header is transmitted at the beginning of the message and
extends to the first blank line. As described in RFC 822, Standard for the Format of
ARPA Internet Text Messages, the email message header includes Received: lines
added by each MTA that received the message, the message time-stamp, message ID,
To and From addresses, and the Subject of the message. The message header is
followed by the body of the message (in this case, a single line of text), the
terminating period, and the final handshaking at the end of the message. Here, the
term "message" alone refers to the overall email message as well as the multiple
protocol messages (e.g., HELO, MAIL and RCPT) that are used by SMTP.

Spammer Techniques

The two primary techniques used by spammers are relaying and direct
SMTP from a dialup PC. Approximately one-half of all spam attempts are relayed
from an attacker through an intermediate site that permits relaying. Many of these
open relay sites have been recently added to the Internet without regard to good
system administration practices, and consequently may permit relaying without regard
to its consequences.

Approximately one-third of junk mail is sent directly from a dialup PC to the
recipient mailhost. The use of direct SMTP from a PC provides the ability to forge
email. As open relays are closed, this percentage is likely to rise. The remainder
(approximately 15%) of junk mail is from users that appear to have an account on the
sending network.

Regardless of which technique is used, however, almost all junk mail has
similar characteristics. Junk mail messages almost invariably have a forged email
address in order to discourage complaints by the recipients of the spam. Contact
information is provided somewhere in the body of the message, and may be another
email address, a link to a web page, or a telephone number. In addition, junk mail
frequently does not include the recipient's address in the header of the message. This
is done primarily as a performance optimization.

In addition, junk mail is usually sent from a "throwaway" account, in which
the spammer sends a batch of messages (usually thousands of messages) and then
moves on after being canceled. Similarly, spamming networks sometimes perform
spam runs from a mail server, then take the host offline to avoid complaints. Such
networks operate until they are widely blacklisted, then register a new domain and
carry on business under a different name.

Any person with an email address at an Internet Service Provider (ISP)
account can send junk email. After acquiring an address list, the user can send a
message to each address on the list using the mailer program provided by the ISP.
However, as shown in the examples in Figures 2 and 3, most ISPs record the sender's
actual email address in outgoing message headers. If recipients complain, the ISP will
often terminate the user's account, sometimes billing cleanup fees in accordance with
the network's Acceptable Use Policy (AUP). Consequently, this technique is not
favored by most spammers.

Relaying is not inherently bad. Early mailhosts relayed as a matter of courtesy
and convenience for system administrators to test their mail systems. In addition,
most networks relay internally so that not all network hosts have to be able to handle
Internet mail. Small network subscribers often relay through a "smart host" provided
by their ISP that is configured to handle the more complicated aspects of Internet
mail. This arrangement is intentional and usually is not abused.

The problem occurs when a host indiscriminately relays mail from any domain
to any other domain. These hosts are known as "open relays". The practice is
sometimes referred to as "third-party relaying", since the relay host is neither the
initial sender of the message nor the intended recipient.

Open relays permit the spammer to easily forge his/her identity. Figure 4
shows how a spammer at spam.dom 1060 relays mail via relay.dom 1061 to a variety
of different users at different target domains 1062, 1074, etc. At step 1063, the
spammer connects to relay.dom, as described with regard to Figure 2. For clarity,
SMTP responses (greeting messages, 250, etc.) are not shown in this figure.

At step 1064, the spammer forges a MAIL From message listing an address at
the open relay host 1061. The forged MAIL address can be at any network including
spam.dom, relay.dom, any of the netN.dom hosts, or somewhere else. The forged
MAIL From: address may be the same as the From: line in the message header, or it
may be different. At one time spammers commonly forged addresses at AOL.COM or
other large networks, because those networks were so well known, but legal action by
AOL in particular has largely stopped that practice. The spammer is able to forge the
MAIL address usually because he or she is able to override the normal user
authentication functions, perhaps as a trusted user of a network server or as the
operator of a single-user PC.

At steps 1065, 1066, the spammer sends multiple RCPT messages with a list
of destination addresses. Finally, at step 1068, the spammer sends a DATA message,
the text of the email message, a period, and a quit message to relay.dom. When
relay.dom receives the message, it stores the message in its mail queues until it has
forwarded the message to each of the target addresses, or until the message has timed
out. If it cannot deliver a message, it will typically retry periodically (perhaps every
10 minutes or perhaps once per day). The relay host will usually keep undelivered
messages in its queue for up to a week.

The result is that spam.dom will send the message once and the relay host
1061 will forward a copy of the message to each host 1062, 1074 in the address list.
For example, relay.dom will open a connection 1070 to host net1.dom 1062, send the
MAIL message 1071, send the RCPT message 1072, and then send the text of the
message 1073. The relay host 1061 repeats this process for host net2.dom 1074, as
shown by steps 1075-1078, and any remaining target hosts (not shown). If spam.dom
listed 100 different hosts in the RCPT addresses it sent to relay.dom, then relay.dom
will attempt to send the message 100 times.

The difficulty in filtering relayed junk mail is shown in part by this example.
If the spammer 1060 forges the MAIL From address to match the relay host (e.g.,
"good@relay.dom") then as observed by net1.dom 1062, the message appears to be
from a legitimate user at relay.dom. This example shows abuse of one open relay.
The current generation of relaying tools will also permit the spammer to enter a list of 
open relay hosts, and the software will use different relays for different groups of 
addresses. Thus, different users on the same network may receive spam relayed
via different paths.

The primary technique in blocking relayed mail involves databases of
blacklisted IP addresses, which can be consulted by spam filtering software to
determine whether the sending host is an open relay. For example, sendmail 8.9.3
provides an option to look up the IP address of the sending host in such a database,
and reject the mail if the database indicates that the IP address is an open relay.
Examples of online blacklist databases include, for instance, the Mail Abuse
Prevention System (MAPS) Realtime Blackhole List (RBL) and the Internet Mail
Relay Services Survey (IMRSS).

The problem with such blacklisting databases is that they are static rather than
dynamic. Consequently, an open relay must be abused at least once, reported to the
database, confirmed by the database organization, then added to the database, before it
will be blocked. Because database methods are static, the entry for a host must be
manually removed when the host's mailer is fixed so it no longer relays. This takes an
exchange of messages, re-testing, etc. In addition, these remote database methods
involve connections to the database server and a lookup on that server (which may be
doing lookups for hundreds of other users). Because these databases are global, they
are not under control of local administrators. That is, if an organization has a
customer that has an open relay, then the organization must either stop using
blacklists such as MAPS or IMRSS, or risk having mail from the customer blocked
because of an entry in the MAPS or IMRSS databases.

These database organizations typically take referrals from administrators
throughout the Internet for open relay addresses. The organization then typically
verifies the relay status before placing the address in the database. In the general case,
an open relay can be confirmed by attempting to send a message from user A to user
B, using the candidate relay address as an intermediate forwarder. The relay host may
in turn relay the message to an additional host in its network (known as
"multi-hop" relaying), before sending it to user B. However, if user B eventually
receives the message, then the host must have relayed.

A much simpler test can be performed by simply telnetting to the SMTP port
(TCP 25) of the suspected open relay, then issuing SMTP commands as indicated in
the ">" sequences in Figure 5 and observing the responses indicated in the "<"
sequences. If the two networks are unrelated (i.e., the remote host is not acting as a
legitimate smart host for the local organization) and the suspected relay host returns a
"250" response to the RCPT message, then the remote host probably is an open relay.
After the response to the RCPT message is received, the testing host can close the test
connection without actually sending any data. However, this test is not perfectly
accurate, as it fails to identify multi-hop relays. There are also some hosts that give
"250" responses to the RCPT message, but actually reject the relay attempt during
later mail processing.
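
The simple test just described reduces to sending three commands and inspecting the reply to the last of them. A minimal sketch follows, under stated assumptions: the function names and addresses are hypothetical, the exact commands of Figure 5 are not reproduced here, and the network exchange itself is left out so that only the decision rule is shown.

```python
def relay_probe_commands(probe_host, sender, recipient):
    """Commands a testing host might issue; no DATA is sent, per the text."""
    return [
        "HELO " + probe_host,
        "MAIL From:<" + sender + ">",
        "RCPT To:<" + recipient + ">",
        "QUIT",  # assumed polite close; the connection may simply be dropped
    ]


def probably_open_relay(rcpt_reply_line, networks_related):
    """Apply the rule from the text to the reply to the RCPT message.

    A "250" reply from a host on an unrelated network suggests an open relay.
    The result is only a likelihood: multi-hop relays are missed, and some
    hosts answer "250" but reject the relay attempt later.
    """
    if networks_related:
        # A legitimate smart host is expected to relay for its own network.
        return False
    return rcpt_reply_line[:3] == "250"


probe = relay_probe_commands("tester.dom", "a@tester.dom", "b@elsewhere.dom")
verdict = probably_open_relay("250 <b@elsewhere.dom>... Recipient ok", False)
```

Both the sender and the recipient in the probe are chosen from outside the suspected host's own domain, since relaying for either local party would be legitimate.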

PC-based SMTP Direct 




Figure 6 shows how a spammer can use a dialup PC 1080 running a SMTP
direct program 1081 that is able to establish SMTP connections 1044 directly to the
SMTP port of the target mailhost. The term "dialup" as used herein refers to a class of
Internet subscribers characterized by an inability to service incoming mail requests
(i.e., not a mail server), having a related if not sequential name space, often using
dynamically-assigned addresses, and generally existing at the lowest tier of pricing
offered by an ISP. It includes various means of connecting, not all of which involve
literally dialing in to the ISP, for example, wired cable or packet radio. The spammer
typically provides a single copy of a message 1082 and a list of addresses 1083. The
program establishes an SMTP connection 1044 to each remote MTA 1045, delivers
the message, and proceeds to the next entry in the address list.

Because the Dialup SMTP Direct program 1081 runs under the control of the
spammer, the program can be configured to forge any email address, hostname, or any
field (e.g., the From: address) in the message header. Consequently, a message
received by a user 1048 that is sent by this means may appear to be sent by a
co-worker, from one's manager, from friends on another network, or even by the
recipient himself.

The primary method for blocking junk mail from SMTP Direct hosts is by
using centralized blacklists. These include the MAPS Dialup User List (DUL). The
DUL lists various blocks of IP addresses that are known to be used for dialup PCs.

Current Solutions

The solutions that are presently available to block junk mail fall into seven
general categories. First, the use of centralized blacklisting databases, such as
described above for the RBL, IMRSS, and DUL. Second, the use of local blacklisting
databases, such as sendmail checking a local database and blocking email that
matches entries in the database. Third, blocking mail from nonexistent domains, such
as for instance if sendmail receives "MAIL From: <sender@nonexistent.dom>", it
will reject the mail because it cannot find the domain "nonexistent.dom" listed in the
Domain Name System (DNS).

Fourth, whitelisting methods are used, so that a filter can reject all sender
addresses that are not included in a local whitelist of permissible addresses. Fifth,
Bcc filtering may be used to reject mail from unknown hosts that do not list the
recipient's email address in the header of the message. And sixth, client methods may
be used to reject junk mail located in the user's mailbox without downloading the
mail to the user's mail program (UA). Filtering of client protocols such as POP
provides relief to individual users, but still allows junk mail to be stored on the SMTP
server.
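
As an illustration of the fourth category, a local whitelist check on the MAIL From address might look like the following sketch. The function names and sample addresses are hypothetical, and folding the entire address to lower case is a simplifying assumption, since the local part of an address may in principle be case-sensitive.

```python
def normalize_address(address):
    """Strip optional angle-brackets and surrounding space, fold to lower case."""
    address = address.strip()
    if address.startswith("<") and address.endswith(">"):
        address = address[1:-1]
    return address.lower()


def accept_sender(mail_from, whitelist):
    """Accept only senders whose normalized address is in the local whitelist."""
    return normalize_address(mail_from) in whitelist


# Hypothetical local whitelist of permissible sender addresses.
whitelist = {"sender@remote.dom", "partner@customer.dom"}
accepted = accept_sender("<Sender@Remote.dom>", whitelist)
rejected = accept_sender("<bulk@spam.dom>", whitelist)
```

Because the MAIL address is easily forged, such a check by itself only screens out senders who have not guessed a whitelisted address.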

Seventh, secure electronic mail, such as that based on the emerging
Secure/Multipurpose Internet Mail Extension (S/MIME) and OpenPGP standards, uses
public key cryptography to provide security services such as secrecy (confidentiality),
integrity (ability to detect modification), authentication, and non-repudiation.
Spammers are unlikely to use integrity and non-repudiation services, in particular,
since these involve a digital signature signed with the sender's private key. However,
these systems do not provide a solution to spam, since not everyone uses public key
cryptography. Further, these services typically operate as part of the User Agent, so
S/MIME or OpenPGP-protected spam can still be relayed or sent from dialup
computers.

SUMMARY OF THE INVENTION

It is therefore a primary object of the invention to provide an email filtering
system and method. It is another object of the invention to provide an email filtering
system that substantially eliminates security risks and loss of company resources
associated with junk mail. It is another object to provide an email filter that operates
at the MTA level and performs active filtering based upon characteristics of the
incoming connection and the remote host.

In accordance with these objectives, an Active Filter proxy in accordance with
a preferred embodiment is provided in a conventional firewall configuration between
a remote host and a local MTA. The Active Filter proxy probes the sending host at
the time it connects and implements a series of tests to determine if the remote host is
likely to be either a dialup customer (Active Dialup testing), or an open relay (Active
Relay testing). It also queries the mail server that handles email to the supposed
sender of the message to determine if that mail server will accept email for that address
(Active User testing). Together, these tests address the primary sources of junk mail.

These tests reject SMTP email based on characteristics of the received SMTP
protocol fields and the configuration of the remote host. The Active Dialup test
considers certain characteristics typical of dialup PCs, which include the inability to
operate as a server and generally a sequential naming scheme. The Active Relay test
concludes that if the remote host appears to relay for a test connection, then it will
probably relay for spammers. The Active User test detects obvious forgeries by
blocking email where the configured mailhost for the sender will not accept a reply to
that address.

Because these techniques are performed at the time of the initial SMTP data
connection, they characterize the remote host as it is configured at that time, thus
avoiding the latency problems of static blacklisting databases. Further, rejected mail
does not consume any disk storage on either the proxy host or the mailhost. Instead,
the rejected message remains on the remote host, whether an open relay or dialup PC.

Thus, junk mail that is blocked by at least one of these tests does not make it
onto the local mail server or user clients. Consequently, it cannot be used to propagate
viruses or other malicious code and it cannot distract the intended recipient from
his/her work. The Active Filtering proxy can be chained with other content filtering
proxies in a conventional fashion to reject other objectionable or malicious content in
the body of the message.

Minimal involvement is required by email administrators, when compared
with the administrative cost of removing junk mail from mail servers, cleaning up
after a virus or other malicious code attack, complaining about junk mail, and solving
other problems. Administrator involvement generally consists of reviewing logs and
adding IP address blocks and domain names to trusted databases where necessary.

It is not practical, and perhaps not possible, to blacklist all current and future
sources of spam, or to whitelist all benign sources of legitimate email, because the
Internet grows and changes so quickly. However, it is readily possible for most
administrators to define the relatively few (perhaps tens or hundreds) trusted domain
names and to rely on the Active Filtering methods to characterize the remainder of the
hosts that connect.

The method also provides the ability to automatically append IP addresses
detected by certain sensor points back into the IP filtering list, so that those hosts can
be subsequently blocked by a simple IP lookup mechanism. This provides a
performance improvement by quickly rejecting subsequent connections from IP
addresses that have already been rejected by one of the Active Filtering tests.
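
The feedback of rejected IP addresses into the filtering list can be pictured as a small cache consulted before the more expensive tests run. The sketch below is illustrative only: the class and function names are hypothetical, the Active Filtering tests are represented by arbitrary callables that return True when a host passes, and persistence and expiry of entries are left out.

```python
import ipaddress


class RejectedIPCache:
    """In-memory IP filtering list populated by earlier rejections."""

    def __init__(self):
        self._rejected = set()

    def add(self, ip):
        self._rejected.add(ipaddress.ip_address(ip))

    def __contains__(self, ip):
        return ipaddress.ip_address(ip) in self._rejected


def screen_connection(ip, cache, active_tests):
    """Return True if the connection from ip should be accepted."""
    if ip in cache:
        # Simple IP lookup: reject quickly without re-running the tests.
        return False
    for test in active_tests:
        if not test(ip):
            cache.add(ip)  # append the address back into the filtering list
            return False
    return True


calls = []
failing_test = lambda ip: calls.append(ip) or False  # stand-in for a failed test
cache = RejectedIPCache()
first = screen_connection("192.0.2.7", cache, [failing_test])
second = screen_connection("192.0.2.7", cache, [failing_test])
```

On the second connection from the same address the stand-in test is not invoked at all, which is the performance improvement the paragraph above describes.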

The present invention is compatible with all known SMTP MTAs. The
architecture permits a natural separation of responsibilities for the proxy and the
MTA. The proxy offloads the rejection of junk mail, so that the MTA need only
consider legitimate email. The MTA may provide other conventional spam-filtering
methods of its own (for example, rejecting non-existent MAIL From domains) or may
reject mail because the RCPT user does not exist on the local network.

These and other objects of the invention, as well as many of the intended
advantages thereof, will become more readily apparent when reference is made to the
following description taken in conjunction with the accompanying drawings.



BRIEF DESCRIPTION OF THE DRAWINGS 

Figure 1 depicts the general architecture for Internet electronic mail using the
Simple Mail Transfer Protocol (SMTP).

Figure 2 is a graphical representation of an exchange of SMTP protocol
messages involved in transferring a single electronic mail message from one MTA to
another.

Figure 3 is a printout of the message of Fig. 2, showing the protocol 
transactions, message header, and message body. 

Figure 4 shows how a bulk mail program takes advantage of an open relay 
host elsewhere on the Internet to store a single message and a list of addresses, 
causing the relay to forward the message to each address in the address list at recipient 
MTAs. Spammers typically use relaying to offload processing from their computer 
and obscure their involvement in sending the message. 

Figure 5 shows the SMTP messages used to perform a simple test of a remote 
host to determine if it is an open relay. 

Figure 6 shows how spammers may transfer mail directly from a SMTP direct 
program on a personal computer to the input port of a MTA. Spammers typically use 
this method to make message forgery easier and to avoid their network's controls on 
outgoing email. 

Figure 7 is a block diagram of the Active Filter proxy server system in 
accordance with the preferred embodiment. 

Figures 8-12 show specific architectures in accordance with the present 
invention when deployed [remainder illegible in source]. 

Figure 8 shows the general architecture, in which the Active Filtering proxy is 
connected in a preferred embodiment as part of a firewall between the Internet 1100 
and the organisation's MTA. 

Figure 9 shows the proxy and MTA residing on the same computer. 

Figures 10 and 11 show the present invention implemented as part of a SMTP 
wrapper process or in the MTA itself, respectively. 

Figure 12 shows how a proxy may be chained with a content-filtering proxy 
for enhanced control over incoming email. 

Figure 13 shows an overview of the protocol transactions exchanged in 
transferring a single email message from a remote host, the Active Filtering proxy 
server, and the protected MTA. 

Figures 14-23 show the details of the protocol interactions and processing flow 
for the transfer of a single email message from a remote host 1400, through an Active 
Filtering proxy server 1401, to a local MTA 1402. 

Figure 14 shows the initial connection from the remote host to the proxy, a 
blacklist check, and display of a greeting message to the remote host. 

Figure 15 shows the processing of the remote host's HELO and MAIL 
transactions by attempting to open a reverse test connection 1418 to the remote host. 

Figure 16 shows the general framework for the Active Dialup test. 

Figure 17 shows details of the preferred embodiment for a sequential-name 
check used in the Active Dialup test. 

Figure 18 shows the Active Relay test. 

Figure 19 illustrates the Active User verification method. 

Figure 20 shows how the proxy opens a connection to the local MTA 1402 to 
transfer a valid message. 

Figure 21 shows the transfer of the data in the email message (header, body, 
attachments, etc.) and how the connection is closed. 

Figure 22 illustrates an alternative embodiment for the Active Dialup test 
based upon edit distances between the remote host name and its neighbors' names. 

Figure 23 shows a second alternative embodiment for the Active Dialup test 
based upon the inability to establish reverse test connections to neighbors of the 
remote host. 

Figure 24 is a block diagram of the Active Filter proxy server system in 
accordance with the alternate preferred embodiment having an optional per-recipient 
whitelist database and quarantining. 

Figure 25 is an overview flow chart showing the processing of the MAIL From 
message with respect to the embodiment of Fig. 24. This includes the Active Filtering 
methods described in Figures 15-19; however, enforcement of the decision is made 
separately for each subsequent recipient identified in an RCPT message. 

Figure 26 is an overview flow chart of per-RCPT whitelist processing for an 
individual recipient. The proxy connects to the local MTA after the first authorized 
recipient is identified. 

Figure 27 shows how the proxy quarantines a message that did not pass Active 
Filtering and is not whitelisted for the current recipient. 

Figure 28 shows the processing of the remainder of the email message, 
beginning with the DATA transaction. An email message can be transferred directly 
to one group of recipients and also quarantined for the remainder of recipients. 

Figure 29 shows the retrieval of a quarantined message by a user or 
administrator, with the proxy transferring the quarantined message to the MTA as it 
would any other valid message. 

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS 

In describing a preferred embodiment of the invention illustrated in the 
drawings, specific terminology will be resorted to for the sake of clarity. However, 
the invention is not intended to be limited to the specific terms so selected, and it is to 
be understood that each specific term includes all technical equivalents which operate 
in a similar manner to accomplish a similar purpose. 

Architectures 

Figure 7 illustrates the design of the Active Filtering proxy server. The server 
runs as a process 1104 on a host computer (preferably a firewall host 1103 as shown 
in Figure 8), interposed between remote hosts on the Internet 1100 and a mailhost 
1105. The proxy design requires services provided by the computer hardware platform 
1091 and the operating system 1090. The hardware platform 1091 includes one or 
more processors, memory, disk storage, and network interfaces. The number of 
processors and amount of memory required depends upon the anticipated processing 
load. A small network might suffice with a single processor and 32 megabytes of 
random access memory (RAM), while a larger network might require a multiprocessor 
implementation with hundreds of megabytes of RAM. The hardware platform 
provides disk storage for the program used to implement proxy 1104, operating 
system 1090, various configuration databases (1093-1098), and a log file 1099. Two 
network interfaces are preferred if the mailhost 1105 is to be hidden behind a firewall. 
The platform may also include a console (not shown) for configuring and controlling 
the server; however, this may also be performed via the network. 

The operating system 1090 provides an execution environment for the proxy 
process of proxy 1104 using the hardware 1091. It provides Transmission Control 
Protocol (TCP) socket services, Domain Name System (DNS) services, file system 
services, memory management services, and logging services. In modern operating 
systems (such as Solaris, Linux, AIX, and Windows NT), the file and memory 
management functions cooperate to provide access to a virtual memory space that 
exceeds the amount of physical memory available. Since the program image of proxy 
1104 and all of the configuration files except the blacklist database 1095 and log 1099 
are read-only, these may be read in once from disk and then subsequently accessed from 
virtual memory. 

The operating system also provides the abstraction of TCP sockets 1092 and 
1089. Each socket identifies a remote host endpoint, such that the socket 1092 is 
associated with a remote host (shown subsequently as Figure 13 item 1400) and the 
socket 1089 is used to control communications with the local Message Transfer Agent 
1402 (Figure 13). An additional socket (not shown) is used for each test connection, 
e.g., 1418 (Figure 15) or 1903 (Fig. 19). The operating system also provides a means, 
such as the UNIX Internet Daemon (inetd), for dispatching programs (such as the 
proxy 1104) when connections are received from the Internet 1100. 

Configuration databases include Trusted DB 1093, which is used to identify 
trusted networks that are permitted to bypass further filtering; Whitelist DB 1094, 
which contains individual email addresses that are permitted to bypass further 
filtering; Blacklist DB 1095, which identifies IP addresses of remote hosts that will be 
blocked immediately after they connect to the proxy server; Relay DB 1096, which 
contains configuration data for the Active Dialup filter, including addresses of 
untrusted hosts that are known not to be dialup clients; Dialup DB 1097, which 
identifies untrusted hosts that are known not to be dialup clients; Configuration DB 
1098, which includes general data such as the IP address and port for the Mailhost 
1105, permissible domain names for RCPT messages, etc.; and System Log 1099, as 
typically provided by the UNIX syslog facility or Windows NT Event Log service. 
The preferred embodiment is for each database to be provided as a separate file. 
However, alternative embodiments may provide for merging some or all databases 
into a single configuration database, preferably excluding the Log 1099. 

Further to the preferred embodiment, the Active Filtering Proxy 1104 is run 
once for each incoming connection received from the Internet 1100, reads the 
configuration databases 1093-1098, interacts with the remote host to determine if it is 
likely to be a source of junk mail, and either closes the connection (without any mail 
being transferred) or opens a connection via the socket 1089 to permit the remote host 
to communicate with the mailhost 1105. In either case, the proxy writes one or more 
log entries to the Log 1099. The proxy does not save the message to a local file but 
instead performs all transfers from memory buffers. That is, the proxy receives a 
SMTP message from socket 1092 into a memory buffer, optionally validates the 
contents of the buffer, and then sends the contents of the buffer via socket 1089 to the 
mailhost 1105. In the preferred embodiment, the proxy exits after processing each 
message; however, alternative embodiments may provide for a single proxy that 
simultaneously handles multiple messages. 
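
As a rough illustration of the buffer-to-buffer transfer described above, the following sketch receives one chunk from the remote-host socket into memory, optionally validates it, and forwards it to the mailhost socket without touching disk. The function name `relay_chunk`, the `validate` callable, and the buffer size are assumptions made for the example, not details taken from the specification.

```python
import socket


def relay_chunk(src, dst, validate=None, bufsize=4096):
    """Forward one in-memory chunk from src to dst.

    src and dst are connected stream sockets (standing in for sockets
    1092 and 1089). validate is an optional, hypothetical check applied
    to the buffer before it is forwarded. Returns True if data was
    forwarded, False if the peer closed or validation failed.
    """
    data = src.recv(bufsize)          # message data is held only in memory
    if not data:
        return False                  # remote side closed the connection
    if validate is not None and not validate(data):
        return False                  # buffer rejected; nothing forwarded
    dst.sendall(data)
    return True
```

A real proxy would call such a function in a loop and would need to handle SMTP line boundaries; the sketch deliberately leaves that out.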

Figures 8-12 show five architectures that provide Active Filtering of junk mail 
received from the Internet 1100 and addressed to an MTA 1106 on the organization's 
mail server 1105. The Active Filtering proxy 1104 (Fig. 7) runs on a separate firewall 
host 1103 (Fig. 8) or on the same mail server host 1107 that is running the MTA 1106 
(Fig. 9). The methods can be implemented as part of a mail wrapper 1110 (Fig. 10), 
or may be integrated as part of the MTA 1113 itself (Fig. 11). If it runs as a separate 
proxy server 1103 (Fig. 8), it can be chained with other proxy servers 1116 to 
implement more complex filtering policies (Fig. 12). 

The organisation's network includes, at a minimum, a router 1101, Internet 
connection 1100, Local Area Network (LAN) 1102 and MTA 1106. Accordingly, 
these components are common to all of the architectures shown in Figs. 8-12. The 
packet-filtering router 1101 routes packets from the Internet 1100 to the SMTP proxy 
server via the LAN 1102. The router operates at the network layer of the protocol 
reference model using the Internet Protocol version 4 (IPv4). However, with 
appropriate changes to the socket programming interface, the present invention also 
operates with other network layer protocols such as Internet Protocol version 6 (IPv6) 
or Novell Netware. 

Internet connection 1100, which is between the router 1101 and external hosts, 
is typically provided at the physical layer by wired or dialup 
circuits (such as dialup modem, ISDN, ADSL, or cable TV) using link-layer protocols 
such as Point-to-Point Protocol (PPP) or Serial Line Internet Protocol (SLIP). The 
present invention operates at the application layer and is independent of the Internet 
connection. 

As with Figure 7, the firewall host 1103 has two separate LAN interfaces 1102 
and 1117. LAN 1102 interconnects the Internet 1100 with the firewall host 1103. 
LAN 1117 connects the firewall host 1103 with the organization's protected servers 
(e.g., 1105) and workstations (not shown). These LANs are typically Ethernet or 
Token Ring technology. However, the present invention is independent of the type 
of LAN technology (adapters, device drivers) used by the organization. 

Each architecture of Figs. 8-12 has one or more MTAs 1106. These include 
programs such as Unix sendmail, Microsoft Exchange, Netscape Messaging Server, 
Lotus SMTP, Apple Internet Mail Server, special-purpose SMTP servers used by the 
various Internet service providers, and other MTA programs. These MTAs may 
themselves perform some degree of junk-mail filtering (for example, automated 
searches of the MAPS Realtime Blackhole List (RBL), MAPS Dialup User List 
(DUL), IMRSS relay list, or other databases). However, except as described for the 
Active Filtering MTA 1113, the methods described by the present invention are 
compatible with but do not depend on the choice of MTA 1106 or the MTA's 
spam-filtering mechanisms. 

Turning now to Figure 8, the Active Filtering mechanisms are implemented as 
an SMTP proxy server process 1104 on a dedicated firewall host 1103, as is typical in 
a firewall architecture. The proxy server is identified as a Mail Exchange (MX) host 
in the DNS information for the local organization and has bound port 25 so that all 
connections to port 25 of host 1103 will be directed to the proxy server process of 
filter 1104. 

Thus, when a remote host attempts to send mail to a user at the local network, 
the remote host gets the name of the proxy server from the MX record, translates the 
name into an IP address, acquires a socket, and opens a Transmission Control 
Protocol (TCP) connection to port 25 of the firewall host 1103, in accordance with 
standard socket programming practice. If the active probing of the remote host 
characterises it as unlikely to be a source of spam, then the proxy server process 1104 
opens a connection to port 25 of the mailhost 1105 (which has likewise been bound by 
the MTA 1106), transfers the initial protocol messages to the MTA 1106, and then 
transparently passes data to the MTA 1106. 

The router 1101, firewall host 1103 and mail server host 1105 can also be 
installed on a single LAN 1102. In this case, the firewall host 1103 has a single 
physical LAN interface device that is shared by the two logical interface functions 
(message arrival via the router 1101, and message delivery to the mail server host 
1105). The use of a shared physical LAN interface is conceptually the same as shown 
in Figure 8, with the exception that the firewall host 1103 cannot be configured to 
block packets from the Internet 1100 to the mail server host 1105. In this case, the 
router 1101 must be configured to block such direct access from the Internet to the 
mail server host 1105. 

With respect to Figure 9, the same Active Filtering proxy 1104 runs as a 
process on the mail server host 1107. The proxy 1104 performs the same functions as 
it does when it runs on a separate firewall host (i.e., Fig. 8), except that it does not 
necessarily need to establish an SMTP connection to the MTA 1106. Instead the 
proxy server 1104 may use any available Inter-Process Communications (IPC) method 
1108 that is provided by the mail server host 1107. 

Depending on the operating system, there are several alternatives for 
communicating with the MTA 1106. A first alternative, for instance, includes a TCP 
connection to some port other than TCP 25. Port 26 is used, although any other TCP 
port could be used. The router 1101 must then be configured to prevent packet 
communications directly to the selected port (e.g., 26). 

A second alternative, with Unix hosts running sendmail, is for the proxy 1104 
to be configured to save the message into a file, and pass the file to sendmail using the 
sendmail command line interface. Still yet, in a third alternative with Unix hosts, the 
proxy 1104 could use a Unix domain socket, named pipe, or other mechanism that is 
supported by the local MTA 1106. 

In Figure 10, the Active Filtering technology may be included as part of a 
MTA wrapper program 1110, for example, the Trusted Information Systems (TIS) 
Firewall Toolkit (FWTK) sendmail wrapper smap program. The smap program is 
essentially an SMTP proxy, but its primary function is not to block junk mail but 
rather to protect the sendmail program from attacks (such as unauthorised use of the 
DEBUG option), from stack overflow attacks, and from other external attacks on 
sendmail. Thus, the purpose of a wrapper 1110 is to protect the MTA 1106, and 
Active Filtering is an ancillary function. Various IPC methods 1108 are possible, 
although the FWTK smap program uses the sendmail command line interface. There 
is no need to do any special packet filtering, since the interface to sendmail is not 
visible to remote hosts. 

As shown in Figure 11, the Active Filtering technology 1112 could be 
implemented as part of a standard MTA 1106, resulting in a special MTA 1113 with 
Active Filtering. 

As shown in Figure 12, the Active Filtering proxy 1104 (on firewall host 
1103) can be chained with other proxy servers 1116 (on firewall hosts 1114) to 
perform other mail filtering functions. For example, various products, such as in 
accordance with U.S. Patent No. 5,623,600, provide filtering of viruses and other 
malicious code. Preferably however, the Active Filtering proxy 1104 is the first host 
in the chain of proxies, that is, closest to the Internet, so it is best able to determine the 
essential characteristics of the remote host that is attempting to send email. The two 
filtering proxies 1104 and 1116 provide improved filtering by requiring each message 
to pass through both filters before it can be accessed at a client workstation. 

Provided the Active Filtering proxy has full access to the remote host, other 
configurations are possible. For instance, the Active Filtering and content filtering 
proxy servers (as well as the MTA) may run on the same proxy host. In addition, 
invocation of the content filtering proxy may use a means such as the Content 
Vectoring Protocol (CVP), rather than by serially linking the two proxies. This 
architecture permits additional proxies to be added to the chain, for example, proxies 
having other spam detection mechanisms or other content filtering techniques. 

With respect to Figures 8-12, there does not necessarily have to be a 
one-to-one relationship between the number of Active Filters and the number of 
MTAs within an organization. For example, in Figure 8, based upon performance and 
loading considerations, there might be three firewall hosts 1103, each connected to the 
LAN 1102, each running an Active Filtering Proxy process 1104, each having its own 
unique IP address, and each being configured as a MX host within the organization's 
DNS database. All three proxy servers 1103 would connect to the MTA 1106 only 
when they have legitimate (non-dialup, non-relayed, non-forged) email to deliver. 
While the individual Active Filtering processes themselves involve additional time and 
computing resources, these offload the processing of junk mail in such a way as to 
reduce the overall load on the MTA 1105. 

Within these various architectures, Active Filtering operates primarily as a 
server with respect to the initial connection from the remote host. SMTP is a 
client-server protocol, in which the remote host (client) issues requests to the local 
host (server). Although the remote host initiates the connection and each subsequent 
protocol exchange, transfer of the message is under control of Active Filtering proxy 
1104, which may decide to reject a particular SMTP transaction or even disconnect 
from the remote host. The Active Filtering proxy 1104 (and its implementations in 
1110 and 1112) provides for actively probing the remote host with a reverse SMTP 
connection to identify certain characteristics of the remote host that historically have a 
high correlation with sources of junk mail. 



Operation 

Figure 13 provides an overview of the present invention, with more detailed 
operation shown in Figs. 14-23. The figure shows the key steps used by the Active 
Filter Proxy 1401 to validate a single email message from a remote host 1400 and 
transfer the message to the protected MTA 1402. A separate SMTP connection 1418 
is used for actively probing the remote host in order to perform Active Dialup 1420 
detection and Active Relay 1450 detection. An additional connection may be 
established to a different mailhost for Active User testing. The Active Filter Proxy 
1401 corresponds to proxy 1104 shown in Fig. 7. 

The proxy 1401 is shown in Fig. 13 between the remote host 1400 
and the local MTA 1402. The proxy 1401 and MTA 1402 may be located at separate 
hosts, as shown in Figures 8 and 12, or at the same host as shown in Figures 9-11. 
Because the proxy 1401 controls when it reads data on the connection 1403, it is not 
possible for the remote host 1400 to proceed with transfer of its message until the 
proxy 1401 completes its filtering. The proxy only handles incoming email and does 
not process outgoing email from the MTA to remote hosts. Outgoing email is sent 
directly from the MTA 1402 to the network. 

With respect to Internet standards, the present invention may be implemented 
without any changes to SMTP or any other protocol. Rather, this method uses 
multiple SMTP connections, appropriately timed to permit the proxy server to 
characterize the remote host 1400. Thus, the SMTP connection 1403 is initiated by 
the remote host 1400, and involves transactions 1410, 1413, 1480, 1484, 1488, 1493, 
and 1495. The SMTP connection 1418 is initiated by the Active Filtering proxy 1401, 
and involves transactions beginning at step 1450. This session is used only to acquire 
protocol responses from the remote host 1400. It does not actually send an email 
message from the proxy server 1401 to the remote host 1400. In addition, the proxy 
server 1401 makes other connections to DNS name servers and, if the connection 
1418 fails, may make an SMTP connection to the Mail Exchange (MX) host for the 
address given in step 1413. 

Taken together, the filtering performed by the Active Filtering proxy 1401 
involves the following actions when a remote host 1400 establishes a TCP connection 
1403 to the proxy. First, as shown in step 1406, the proxy server 1401 gets the IP 
address of the remote host and compares the IP address with a database of disallowed 
addresses. If the IP address of the remote host 1400 matches an entry in the database, 
the proxy server closes the TCP connection 1403 without transferring an email 
message. This is described in greater detail in Figure 14. 

At steps 1410 and 1413, the proxy server processes the HELO (optional) and 
MAIL messages from the remote host 1400. The MAIL message contains the address 
of the purported sender of the incoming message, which is commonly forged in junk 
email. Except for trusted addresses (e.g., trusted hosts or whitelisted addresses) and 
other reverse test connections 1418 (to prevent cycles of reverse test connections), the 
proxy attempts to open a reverse test connection 1418 to the remote server host. The 
response (or lack of response) from the remote host dictates the subsequent processing 
flow. 

If the proxy cannot open the reverse connection, it may be because the remote 
host is a dialup workstation. Accordingly, the proxy then performs Active Dialup 
testing 1420. Internet service providers typically block service requests (such as 
SMTP) to their dialup customers using dynamic IP addresses (e.g., assigned by 
Dynamic Host Configuration Protocol, DHCP, which automatically assigns IP 
addresses to client stations logging onto a TCP/IP network). The proxy then uses 
certain heuristics based on the name of the host and its neighbors to categorize the 
host as a dialup or non-dialup. If it categorizes the host as a dialup, the proxy closes 
connection 1403 without transferring the email message. Otherwise, it performs 
Active User testing of the Mail Exchange (MX) host for the From address given in the 
MAIL message. Active Dialup is described more fully with respect to Figures 16-17. 

The administrator can configure the types of testing to be conducted by the 
proxy. The proxy reads the configuration database 1098 to determine the proper 
filtering modes. Thus, the administrator can set the configuration database 1098 to 
include flags for Active Dialup filtering, Active Relay filtering on a reverse 
connection, Active User filtering, Bcc filtering, and/or to append a filter to the 
blacklist database 1095 when any filter finds an email problem. The proxy filter will 
then conduct the appropriate filtering for the flags set in the configuration database 
1098, but will not take any action for flags that are not set. 
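
One way to picture the flag-driven configuration is as a set of bit flags that select which tests run. The sketch below is illustrative only: the `FilterMode` enumeration, its member names, and the `enabled_tests` helper are hypothetical and do not reflect the actual layout of configuration database 1098.

```python
from enum import Flag, auto


class FilterMode(Flag):
    """Hypothetical flags mirroring the filtering modes named above."""
    NONE = 0
    ACTIVE_DIALUP = auto()
    ACTIVE_RELAY = auto()
    ACTIVE_USER = auto()
    BCC = auto()
    AUTO_BLACKLIST = auto()   # append to the blacklist when a filter fires


def enabled_tests(config, tests):
    """Return the names of tests whose flag is set in config.

    tests is a list of (flag, name) pairs; order is preserved so that
    tests run in the sequence the caller supplies. Flags that are not
    set are skipped, so no action is taken for them.
    """
    return [name for flag, name in tests if flag in config]
```

For example, a configuration of `FilterMode.ACTIVE_RELAY | FilterMode.BCC` would select only the relay and Bcc checks from a list covering all four tests.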

If the reverse connection is successfully opened, then the proxy performs 
Active Relay 1450 testing. Under Active Relay testing 1450, once the reverse 
connection 1418 is opened, the proxy 1401 sends HELO, MAIL From, and 
RCPT To messages to determine if the remote host would relay mail for the local 
proxy. If so, then it follows that the remote host is a high risk for relaying mail from 
other sources. If the reverse test messages 1450 indicate an open relay or the remote 
host rejects the MAIL From address 1413, the proxy preferably sends an error 
message and immediately closes the connection. Active Relay testing is discussed 
more fully with respect to Figure 18. 
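
The reverse-connection dialog just described can be sketched as a small classifier over SMTP reply codes. This is a simplified sketch under several assumptions: `send` is a hypothetical callable that transmits one command on the reverse connection and returns the integer reply code (or None on failure), the probe recipient is assumed to be an address outside the remote host's own domain, and the result labels are invented for the example.

```python
def _is_positive(code):
    """True for a 2xx SMTP reply code."""
    return code is not None and 200 <= code < 300


def active_relay_test(send, proxy_name, mail_from, probe_rcpt):
    """Classify a remote host from its replies on the reverse connection.

    Returns "open_relay", "sender_rejected", "ok", or "indeterminate".
    No DATA command is ever sent, so no message is actually delivered.
    """
    if not _is_positive(send(f"HELO {proxy_name}")):
        return "indeterminate"
    mail_code = send(f"MAIL FROM:<{mail_from}>")
    if mail_code is None:
        return "indeterminate"
    if not _is_positive(mail_code):
        return "sender_rejected"      # remote host rejects the From address
    rcpt_code = send(f"RCPT TO:<{probe_rcpt}>")
    if rcpt_code is None:
        return "indeterminate"
    # Accepting a recipient it is not responsible for suggests an open relay.
    return "open_relay" if _is_positive(rcpt_code) else "ok"
```

Passing the transport in as a callable keeps the classification logic separate from socket handling, which makes the decision rules easy to exercise with scripted replies.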

If the results of the Active Dialup test are negative (that is, the proxy does not 
categorise the remote host as a dialup) or the results of the Active Relay test are 
indeterminate (the proxy is unable to successfully conclude Relay testing on that 
connection), then the proxy 1401 conducts Active User testing 1901. Here the proxy 
identifies a mailhost responsible for processing mail to the supposed sender of the 
message and queries that mailhost as to whether it will accept mail to that address. 
These protocol interactions are similar to those used in the Active Relay method but 
are not shown on Figure 13 since they do not usually involve the remote host 1400. If 
the configured mailhost for that address will not accept a reply to the MAIL From 
address, then the sender's address is probably forged, so the proxy 1401 sends an error 
message and immediately closes the connection. Active User testing is more fully 
discussed in relation to Figure 19. 
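
Taken as a whole, the order in which the three active tests are applied can be summarized in a short decision function. The sketch below is one reading of the flow described above, not a definitive implementation: each argument is a hypothetical zero-argument callable, and the relay result labels ("open_relay", "sender_rejected", "indeterminate") are invented for the example.

```python
def filter_mail_from(open_reverse, dialup_test, relay_test, user_test):
    """Return "accept" or "reject" for a MAIL From transaction.

    open_reverse() -> True if the reverse test connection opened.
    dialup_test()  -> True if the host is categorized as a dialup.
    relay_test()   -> "open_relay", "sender_rejected", "indeterminate",
                      or any other value for a conclusive pass.
    user_test()    -> True if the sender's mailhost accepts the address.
    Tests are called lazily, so only the ones the flow needs are run.
    """
    if not open_reverse():
        if dialup_test():
            return "reject"                       # dialup workstation
        return "accept" if user_test() else "reject"
    result = relay_test()
    if result in ("open_relay", "sender_rejected"):
        return "reject"
    if result == "indeterminate":
        return "accept" if user_test() else "reject"
    return "accept"
```

Because the tests are passed as callables, a configuration that disables a given test could supply a stub that always passes.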

If the proxy 1401 does not reject the MAIL From transaction 1413 following 
the Active Dialup, Active Relay and/or Active User testing, then in step 1470 the 
proxy opens a data connection to the MTA 1402. In step 1472, if a HELO message 
was received in step 1410, then the proxy sends it to the MTA. In step 1474, the proxy 
sends the MAIL From message (received in step 1413) to the MTA, and sends the 
MTA response back to the remote host. This is more fully described with respect to 
Figure 20. 

Once the proxy 1401 opens the connection 1470 to the MTA 1402, it transfers 
protocol messages (e.g., RCPT 1480, DATA 1484, message data (header 1488 and 
body 1493), and the dot and quit messages 1495) to the MTA as they are received. 
This occurs transparently with the exception of a conventional Bcc filter 1491 that 
scans lines of the message header for To: or Cc: lines containing a local domain name. 
If it does not find such a header line, as is commonly done in junk mail messages, the 
Bcc filter 1491 returns an error to the remote host and closes all connections. The 
proxy 1401 also transfers MTA 1402 protocol responses (e.g., 250, 550, not shown) 
transparently to the remote host 1400. This is described in greater detail in Figures 20 
and 21. 
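
The Bcc check described above amounts to scanning the header for a To: or Cc: line that mentions a local domain. The following is a minimal sketch under simplifying assumptions: the function name `passes_bcc_filter` is hypothetical, matching is a plain case-insensitive substring test, and folded (continuation) header lines are not handled.

```python
def passes_bcc_filter(header_lines, local_domains):
    """Return True if any To: or Cc: header line names a local domain.

    header_lines is an iterable of header lines as strings;
    local_domains is an iterable of domain names served locally.
    """
    domains = [d.lower() for d in local_domains]
    for line in header_lines:
        lowered = line.lower()
        if lowered.startswith(("to:", "cc:")):
            if any(d in lowered for d in domains):
                return True
    return False
```

A message whose only recipient header is something like `To: undisclosed-recipients:;` would fail this check, which is the junk-mail pattern the filter targets.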

When the message is transferred successfully, the MTA 1402 normally closes 
the connection to the proxy 1401, which in turn closes the connection to the remote 
host 1400. In single-threaded implementations, the proxy simply exits. In 
multi-threaded implementations, the proxy deallocates the resources (sockets, memory 
buffers, etc.) used for the message exchange and resets internal state variables to 
indicate that the message is no longer active. 

Figures 14-23 detail the methods and apparatus of the Active Filtering 
methods shown in Figure 13. These diagrams use a combination of the protocol 
message format favored in protocol documentation, and logic diagrams for the Active 
Filter proxy itself, as is commonly used in software documentation. 

There are three primary participants throughout the protocol descriptions, 
namely Remote Host 1400, Active Filter Proxy 1401 and Local MTA 1402. Remote 
Host 1400, shown at the left in the figures, is the host that is attempting to send mail 
to the local domain. This host may be a sending MTA, a telnet session from a user 
shell account on the remote host, or a SMTP direct session from a user workstation. 
Active Filter Proxy 1401 is located between the Remote Host 1400 and Local MTA 
1402, and is shown in the middle of the figures. Local MTA 1402 is shown at the 
right in the figures. The flow of mail from a legitimate host to the local protected 
MTA is shown in the figures as flowing from left to right. The system further 
interacts with DNS name servers, as well as the sender's configured mailhost (Fig. 
19). 

Referring back momentarily to Figures 8-12, the Active Filtering design may 
be implemented in various forms. Figures 13-21 apply to an Active Filter proxy 
server process that may or may not be located on the same host as the local MTA 
process 1402. The system is applicable to each embodiment of Figures 8-12 since the 
proxy 1401 and the MTA 1402 are assumed to interact via some form of Inter Process 
Communications (IPC) or network communications, for which the details of this 
IPC are irrelevant. In accordance with the preferred embodiment, the proxy runs on a 
separate host that is part of a firewall (Fig. 8). However, the proxy and the MTA may 
run as two processes on the same host, using a TCP connection between the two processes. 

In all cases, the proxy host 1401 is preferably a Mail Exchange (MX) host for 
the local domain and is configured to listen on the SMTP port (TCP 25) for 
connections from remote hosts 1400. In the preferred embodiment, the proxy runs on 
a Unix system and the Unix inetd (Internet Daemon) program (not shown) is 
configured (via the /etc/inetd.conf file) to start a separate instance of the Active 
Filtering process when it receives the TCP connection to port 25. Thus, the proxy 
process 1401 handles a single message and exits when it has either rejected the 
message or transferred the message to the MTA. 

Connect-Time IP Address Filtering 

Operation of the filter proxy 1401 will now be described with reference to Fig. 
14, which is after the proxy 1401 receives a connection 1403 from the remote host 
1400. Starting at step 1404, the proxy 1401 gets the remote host's IP address and 
hostname from the Domain Name System (DNS). This is typically performed by 
calling the getpeername() function to get the 32-bit IP address of the connecting host 
and then converting it to a dotted-quad format (e.g., 192.168.200.201). 

The proxy then calls gethostbyaddr() to get the remote hostname (e.g.,
"smtp.remote.dom") from the IP address and calls gethostbyname() to verify the
consistency of DNS information about the remote host. Properly configured hosts
have a DNS Pointer (PTR) record that maps the IP address to a host name, and an
Address (A) record that maps the name to the corresponding IP address. At the end of
this step, the proxy has both the IP address and name (if defined in DNS), as well as
an indicator as to the consistency of this information.
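
By way of illustration only, the lookups of step 1404 might be sketched in Python
as follows. The function name check_dns_consistency, the three status labels, and the
injectable resolver parameters are assumptions introduced here so that the logic can
be exercised without a live name server; they are not taken from the figures. By
default the sketch calls the standard socket.gethostbyaddr() and
socket.gethostbyname_ex() functions.

```python
import socket


def check_dns_consistency(ip, reverse=socket.gethostbyaddr,
                          forward=socket.gethostbyname_ex):
    """Return (hostname, status) for a connecting IP address.

    status is "consistent", "inconsistent", or "incomplete". The labels
    mirror Table 1 and are illustrative only.
    """
    try:
        hostname = reverse(ip)[0]            # PTR lookup: address -> name
    except OSError:
        return None, "incomplete"            # no name available for the address
    try:
        addresses = forward(hostname)[2]     # A lookup: name -> address list
    except OSError:
        # Assumption: a name with no A record is reported as incomplete.
        return hostname, "incomplete"
    if ip in addresses:
        return hostname, "consistent"
    return hostname, "inconsistent"
```

The function only reports what it found. Consistent with the text, the decision to
reject a host with incomplete or inconsistent information is left to a separate
configuration rule.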

At this stage, the proxy only acquires naming information about the remote
host. It does not, at this point, decide to reject the message. In addition, the
administrator can provide a filtering configuration rule that blocks mail from hosts
that do not have a valid DNS configuration. In keeping with the general theme that
most of the spam problem is because of misconfigured systems (misconfigured open
relays, and the failure of ISPs to use their own packet routers to stop outgoing SMTP
from their dialups), there are also many misconfigured name servers. So it is possible
the proxy could get a connection from any of the servers in Table 1.

Connecting         Name, from           Valid address, from
Host               gethostbyaddr()      gethostbyname()        Result
---------------    -----------------    -------------------    -----------------
192.168.200.200    smtp.remote.dom      192.168.200.200        consistent info
192.168.200.201    smtp.remote.dom      192.                   inconsistent info
192.168.200.202    unavailable          n/a                    incomplete info

Table 1

At step 1405, the proxy determines if the remote host is categorized as trusted.
Trusted networks are usually defined manually by using a suitable editor to enter IP
addresses of trusted networks into the trusted database 1093 (Fig. 7). The proxy looks
up the host name and IP address in a database of trusted network names and IP address
blocks. This database is preferably a single linear file. A host name, e.g.,
"host37.remote.dom", matches an entry "remote.dom" if the two strings match from
the last byte forward, for the length of the shorter string. If the host is trusted,
processing continues with display of the greeting message, step 1409.
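
The trailing-string comparison of step 1405 might be sketched as follows. This is a
minimal Python illustration under stated assumptions: the function name
matches_trusted_entry is hypothetical, and the case-insensitive comparison is an
assumption based on DNS names not being case sensitive.

```python
def matches_trusted_entry(hostname, entry):
    """Compare two names starting at the last byte, over the length of
    the shorter string, ignoring case."""
    host = hostname.lower()
    ent = entry.lower()
    n = min(len(host), len(ent))
    # Empty strings never match; otherwise compare the trailing n bytes.
    return n > 0 and host[-n:] == ent[-n:]
```

A literal reading of the rule also matches a name such as "notremote.dom" against
the entry "remote.dom". An implementation concerned with that case could store the
entry with a leading period.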

If the host is not trusted, processing continues to step 1406, which is also
shown in Fig. 13. Here, the proxy determines whether the remote network has been
blacklisted. The proxy compares the IP address of the remote host 1400 with entries
in the blacklist database. Preferably, the blacklist database is implemented as a linear
file containing one filter per line. Each filter consists of an ASCII dotted-quad
address followed by a forward slash "/" and the number of bits to be compared, for
example, "192.168.200.201/24", with optional textual information such as the date the
filter was created, the host name, and the reason. The proxy compares the remote
host's IP address with a filter entry by converting the two IP addresses to 32-bit
values, XORing the two values, and right shifting the result so that only the specified
number of bits (e.g., 24) remain. If the result is zero, then the remote host 1400
matches that particular filter.
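
The XOR-and-shift comparison just described might be sketched as follows. The
helper names ip_to_int and matches_filter are hypothetical, and the handling of the
optional trailing text (everything after the first whitespace is ignored) is an
assumption about the file layout.

```python
import socket
import struct


def ip_to_int(dotted):
    """Convert a dotted-quad string to an unsigned 32-bit integer."""
    return struct.unpack("!I", socket.inet_aton(dotted))[0]


def matches_filter(remote_ip, filter_entry):
    """filter_entry looks like '192.168.200.201/24 optional text'."""
    spec = filter_entry.split()[0]           # drop optional date/host/reason
    address, bits_text = spec.split("/")
    bits = int(bits_text)
    if not 0 <= bits <= 32:
        raise ValueError("bit count must be between 0 and 32")
    diff = ip_to_int(remote_ip) ^ ip_to_int(address)
    # Shift away the low-order bits that are not compared; a zero
    # remainder means the leading `bits` bits are identical.
    return (diff >> (32 - bits)) == 0
```

The same routine would serve the address-block entries of the dialup database
described later, since they use the same "address/bits" notation.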

However, the proxy can also provide blacklisting approaches other than
this type of long-term, IP-based blacklisting. For instance, the proxy can include
blacklisting by domain name and short-term blacklisting for selected types of
problems. Blacklisting by domain name is useful when an administrator observes a
large amount of junk mail from a particular domain, e.g., ".KR" (Korea), but does not
anticipate a need to receive any legitimate mail from those domains. In this case, the
configuration database 1098 contains a list of patterns, and if the connecting host
name matches any of these patterns, the proxy closes the connection.

Short-term blacklisting can be used to handle temporary situations (such as
remote hosts with bad DNS configurations) as well as to limit bursts or
retransmissions of junk mail when long-term blacklisting is not desirable. Short-term
blacklisting uses an additional blacklist file that is periodically cleared out by the
operating system.

At step 1408, if the remote host is blacklisted, the proxy 1401 issues an error
reply to the remote host (e.g., "550 SMTP administratively blocked"), closes the
connection 1403, logs the rejected connection, and exits without any email being
transferred. The system log 1099 (Fig. 7) may be configured to log on the local host or
on a remote host, such as the local MTA 1402. If the remote host 1400 is trusted or
the IP address acquired in 1404 does not match any entry in the blacklist 1406, then
the Active Filter displays the SMTP greeting message, step 1409.

Processing continues as shown in Figure 15 when the remote host sends data
on the open connection 1403. At this point, the proxy has not established a
connection to the local MTA 1402. The proxy connects to the server (Fig. 20) only
after validating the MAIL From message.

The use of linear files for the trusted database and the blacklist database might
not be optimal for performance in all networks. Accordingly, trusted domain names
(e.g., "remote.dom") might preferably be maintained in a hashed list or dbm file.
Blacklisted IP addresses might preferably be maintained in a bitmap, a hashed list, dbm
file, or even in Content Addressable Memory (CAM) for increased performance. The
check for a blacklisted IP address consists of opening the bitmap database, seeking to
the appropriate byte, and reading the bit for the specified block (e.g., 192.135.140) of
IP addresses. If the bit is set, then the block of addresses is blacklisted, otherwise it is
acceptable.
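
The bitmap lookup might be sketched as follows, assuming one bit per 24-bit
address block. An in-memory bytearray stands in for the bitmap file here; with a
file, the same byte offset would be passed to a seek call before reading one byte.
The names BLOCK_COUNT, block_index, set_block, and is_blacklisted are hypothetical.

```python
import socket
import struct

BLOCK_COUNT = 1 << 24        # one bit for each 24-bit address block


def block_index(dotted):
    """Index of the 24-bit block containing the address (last byte dropped)."""
    value = struct.unpack("!I", socket.inet_aton(dotted))[0]
    return value >> 8


def set_block(bitmap, dotted):
    """Mark the block containing the address as blacklisted."""
    i = block_index(dotted)
    bitmap[i >> 3] |= 1 << (i & 7)


def is_blacklisted(bitmap, dotted):
    """Read the single bit for the block containing the address."""
    i = block_index(dotted)
    return bool(bitmap[i >> 3] & (1 << (i & 7)))
```

At one bit per block the complete bitmap occupies 2 MiB, and each lookup touches
exactly one byte regardless of how many blocks are listed.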

Blacklisted IP addresses are appended automatically to the blacklist database
by various sensors in subsequent filters (the Active Dialup, Active Relay and
Active User Filters), subject to a configuration setting. This permits the Active Filter
proxy to react quickly to floods of spam from a particular host. However, if the
sensor makes a bad decision, then the incorrect filter must be manually removed by
editing the file.



MAIL Message Processing

At Figure 15, the remote host 1400 may send an optional HELO message
1410. In this event, the proxy 1401 simply reads the message in step 1411, potentially
logs the message, and sends a response 1412 to the remote host. This message is
irrelevant with respect to junk mail filtering, since no access decisions are made on
the contents of the HELO message.

At step 1413, the remote host 1400 sends a mandatory MAIL From message to
the proxy 1401. At step 1414, the proxy reads the message from the TCP connection.
The message must contain an email address, represented as "<mfaddr>", in the
Internet address format consisting of the concatenation of a user name, "@" sign, and
domain name. The term "MAIL From address" refers to the entire address passed in
the MAIL From message, and the term "MAIL From domain" refers to the domain
name to the right of the "@" sign. The filter proxy also ensures that the MAIL From
addresses from selected large ISPs, such as AOL.com, HOTMAIL.com and
YAHOO.com, must come from a host with the same name. This rejects a
considerable amount of spam since spammers often forge addresses with well-known
domain names. This aspect, however, is usually only useful for large ISPs.
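
The parsing and the large-ISP check of step 1414 might be sketched as follows. The
set STRICT_DOMAINS, the regular expression, and both function names are
hypothetical. Reading "a host with the same name" as "a connecting host whose name
lies within the MAIL From domain" is an assumption of this sketch.

```python
import re

# Hypothetical configuration: domains whose mail must arrive from a host
# inside the same domain.
STRICT_DOMAINS = {"aol.com", "hotmail.com", "yahoo.com"}

MAIL_FROM = re.compile(r"^MAIL\s+FROM:\s*<([^<>@\s]+)@([^<>@\s]+)>",
                       re.IGNORECASE)


def parse_mail_from(line):
    """Return (address, domain) from a MAIL From line, or None if the
    line does not carry a user@domain address."""
    m = MAIL_FROM.match(line.strip())
    if m is None:
        return None
    user, domain = m.group(1), m.group(2).lower()
    return user + "@" + domain, domain


def domain_allowed(mail_from_domain, connecting_host):
    """Apply the same-name rule only to the configured large domains."""
    if mail_from_domain not in STRICT_DOMAINS:
        return True
    host = connecting_host.lower()
    return host == mail_from_domain or host.endswith("." + mail_from_domain)
```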

At step 1415, the proxy checks the MAIL From address to determine if the
remote connection is from another Active Filtering proxy 1401. For instance, suppose
host A and host B both have an Active Filtering proxy handling incoming mail
connections. Host A opens a data connection to host B. In turn, host B opens a
reverse test connection back to host A. When this happens, host A must recognize the
reverse test connection so that it does not propagate a cycle where each proxy opens
reverse test connections to the other, until either the initial connection is terminated or
one of the proxies runs out of resources.

This can be handled in various ways, such as using a reserved MAIL From
address (e.g., "relaytest@hostb.somenet.dom") to explicitly indicate that this is a relay
test. Alternatively, an Extended SMTP (ESMTP) command such as "XREVTEST"
may be sent by host B to indicate that the connection is a reverse test connection.

In accordance with the preferred embodiment of the invention, the Active
Filtering proxy uses the reserved address "reverse" with the local domain name on
each reverse test connection. This reserved address is used by all Active Filtering
proxies. Continuing with step 1415, the proxy 1401 checks the MAIL From address to
determine if it contains the reserved name "reverse" before the @ symbol. If so, the
proxy issues an error reply 1416 on the incoming connection and exits. The receiving
proxy then closes the connection when it detects this address to prevent abuse by
spammers who might learn this reserved address. In this case, the remote host (e.g.,
the proxy at host B) will not be able to test the local host (e.g., host A), but email will
still be possible.
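
The reserved-address check of step 1415 might be sketched as follows. The function
name is_reverse_test is hypothetical, and comparing the user name without regard
to case is an assumption.

```python
RESERVED_USER = "reverse"    # reserved user name given in the text


def is_reverse_test(mail_from_address):
    """True if the part before the @ symbol is the reserved name."""
    local_part, sep, _domain = mail_from_address.partition("@")
    return sep == "@" and local_part.lower() == RESERVED_USER
```

A proxy that finds this function returning true would issue the error reply 1416
and exit, rather than opening a reverse test connection of its own.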

At step 1417, the proxy filter 1401 skips subsequent checking of the MAIL
From argument if the connecting hostname matches a trusted database entry, using
the same method as in step 1405, or if the MAIL From address matches an entry in
the system whitelist. The trusted database identifies networks with which there are
long-term trust relationships. Any host from one of these domains can send
mail without restriction. If the domain is trusted, step 1417, processing continues with
step 1470. Also at step 1417, the filter skips subsequent checking of the MAIL From
argument if the MAIL From address (user@domain) exactly matches an entry in the
whitelist database.

In the preferred embodiment, the whitelist file is a text file that contains
addresses (one per line) that are periodically mined from sendmail log entries for
outgoing ("to=") messages. These log entries are for mail sent by the local
organization to destination addresses on other networks, so adding these destination
addresses to the whitelist file will ensure that the proxy will permit incoming email
from those persons that local users have sent mail to. However, the whitelist database
may be implemented as hashed database (e.g., dbm) files, or even could be disabled.
If the address matches a whitelist entry, processing continues with step 1470. The
difference between the trusted database 1093 and the whitelist database 1094 is that
for trusted hosts, mail is permitted from any user on the remote host to any user on the
local host. For whitelist entries, mail is permitted only from the named user on the
remote host to any user on the local host.
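
The mining of destination addresses might be sketched as follows. The sketch
assumes only that each delivery log entry carries a "to=" field holding one or more
comma-separated addresses, optionally enclosed in angle brackets; the rest of the
log line is ignored. The function name mine_whitelist is hypothetical, and folding
addresses to lower case is an assumption.

```python
import re

# \b keeps "proto=" (seen on "from=" entries) from being read as "to=".
TO_FIELD = re.compile(r"\bto=(\S+)")


def mine_whitelist(log_lines):
    """Collect the remote destination addresses found in "to=" fields."""
    addresses = set()
    for line in log_lines:
        m = TO_FIELD.search(line)
        if m is None:
            continue
        for item in m.group(1).split(","):
            addr = item.strip("<>").lower()
            if "@" in addr:              # skip local names and empty items
                addresses.add(addr)
    return sorted(addresses)
```

Writing the returned list to a file, one address per line, would yield a whitelist
in the text format described above.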

If the incoming connection is not itself a relay test and the message does not
match any of the trust criteria, then in step 1418, the proxy 1401 attempts to open the
reverse test connection to the remote host 1400. This is typically performed by calling
socket() to acquire a socket structure to manage the connection to port 25 of the
remote IP address 1400, then calling connect() to request the networking software to
establish a TCP connection using the socket.

In step 1419, the proxy 1401 checks the status of the connect() call. TCP
networking implementations by convention return a status of zero if the connection is
successful, otherwise TCP returns -1 and sets an error number to indicate the specific
error. If the reverse connection is successful, then the proxy continues with step 1450
to perform Active Relay testing (Fig. 18).

However, if the reverse connection fails, then the proxy continues with Active
Dialup testing in step 1420 (Fig. 16). In this case, the connection must have been
blocked by the remote network (the remote host 1400 cannot be offline since it has
just connected to the proxy server). In most of these cases the connection will be
actively refused by the remote host 1400 (or its packet routers). This will result in an
error of ECONNREFUSED 61. In a few cases, the remote network may silently block
the TCP open request to the remote host 1400, without giving an error response. In
this case, the local networking software will return ETIMEDOUT 60 as the network
error.
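
Steps 1418 and 1419 might be sketched as follows. The function names, the returned
labels, and the 30-second timeout are assumptions. The numeric values 61 and 60
quoted above are those used on BSD-derived systems; other platforms number these
errors differently, so the sketch relies on the symbolic names from the errno module.

```python
import errno
import socket


def classify_connect_status(err):
    """Map a connect() status to (next step, reason)."""
    if err == 0:
        return "relay_test", "connected"             # continue at step 1450
    if err == errno.ECONNREFUSED:
        return "dialup_test", "actively refused"     # continue at step 1420
    if err == errno.ETIMEDOUT:
        return "dialup_test", "silently blocked"     # continue at step 1420
    return "dialup_test", errno.errorcode.get(err, "unknown error")


def reverse_test(ip, port=25, timeout=30.0):
    """Attempt the reverse test connection; return 0 or an error number."""
    try:
        with socket.create_connection((ip, port), timeout=timeout):
            return 0
    except socket.timeout:
        return errno.ETIMEDOUT
    except OSError as exc:
        return exc.errno or errno.EIO
```

A caller would pass the result of reverse_test() to classify_connect_status() and
branch on the first element of the returned pair.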

Active Dialup Detection




As noted above, email from dialup PCs running direct SMTP programs is a
major problem since the spammer can use the program to forge any protocol field or
message header field. Approximately one-third of the junk mail attempts are from
ISP dialup addresses. The spammer almost always uses a relatively inexpensive
"throwaway" dialup account with an Internet service provider (ISP). These dialup
accounts typically have certain characteristics imposed by their respective ISPs.
Because of the use of dynamic name allocation (e.g., DHCP) and because of pricing
strategies, the ISP permits the user to only operate as a client. That is, the ISP uses its
packet routers to block network service requests such as SMTP to their dialup users.
The second characteristic of dialup accounts is that most ISPs use a regular naming
scheme for such dialup addresses, in order to simplify maintenance of the DNS database.
The names frequently include decimal or hexadecimal representations of the last byte
of the IP address.

In Figure 16, the proxy 1401 continues with Active Dialup detection at step
1420 after determining at step 1419 that it cannot establish a reverse test connection to
the remote host 1400. This method is performed only for untrusted hosts, since the
reverse connection is not attempted for trusted hosts, step 1417.

At step 1421, the proxy 1401 attempts to determine if the IP address or domain
name matches a non-dialup entry in the dialup database. The Dialup configuration
database 1097 (Figure 7) lists blocks of non-dialup addresses that otherwise meet the
criteria for dialup (i.e., will not accept a reverse connection and have a sequential
naming scheme) but that are known to not be dialups. For example, an ISP may have
sequentially-named mailhosts with some mailhosts dedicated for outgoing mail and
some mailhosts dedicated for incoming mail.

To continue with step 1421, there will typically be only a few entries in this
database because most non-dialups are characterized correctly by the Active Dialup
method. The remaining entries are common across the Internet and can be pre-defined
and installed along with the proxy server. It may be necessary to add an entry to this
database whenever any ISP installs or renames a block of mailhosts that appear much
like dialups, but this too can be centrally distributed. It is preferable to list these few
address blocks than to attempt to identify all possible dialup addresses.

The addresses in step 1421 are preferably expressed as a dotted-quad IP
address, a slash "/", and a number of bits to be matched. For example, the filter
192.168.200.201/24 matches all addresses between 192.168.200.0 and
192.168.200.255. An address matches a particular filter if the filter address 1097
(Fig. 7) and the remote host 1400 IP address match for the specified number of bits.
For example, the IP address 192.168.200.29 matches the filter 192.168.200.201/24
because the two addresses are identical for the first 24 bits, i.e., 192.168.200.

The preferred embodiment uses a flat ASCII file structure for the dialup
database. If the requirement for non-dialup entries grows significantly, other
representations (hashed lists, dbm files, or CAM) may be desirable for performance
reasons. If the IP address matches any entry, then the proxy 1401 bypasses any further
dialup testing, and proceeds to step 1901. Relay testing is not conducted since the
filter has already determined that the reverse connection cannot be established to the
remote host at step 1419. If it does not match any entry in the non-dialup list, then it
proceeds with dialup testing in step 1422.

At steps 1422-1424, the proxy 1401 compares the name of the connecting host
with its immediate neighbors, using a heuristic approach to correlate a sequence of
names as dialups or non-dialups. In the preferred embodiment, a threshold total of ten
match points are required to classify a remote host as a dialup. This approach takes
into account the remote host name, character sequences in the name, and sequential
nature of host names near the IP address of the remote host 1400.

At step 1422, the filter scans the node name of the remote host 1400 for
certain sequences and adds or subtracts points. The node name is the part of the host
name up to the first period. For example, the node name for "dial37.remote.dom" is
"dial37". The preferred embodiment obtains this information from an entry in the
dialup configuration database 1097 (Fig. 7), which contains text strings and associated
points, separated by a slash "/", as listed in Table 2.

dial/5      ppp/5       slip/5

dhcp/5      smtp/-5     mail/-5

TABLE 2

In the preferred embodiment of Table 2, five points are assigned if the node
name contains "dial", "ppp" (for Point-to-Point Protocol), "slip" (for Serial Line IP),
or "dhcp" (for Dynamic Host Configuration Protocol). Five points are subtracted if
the node name contains "smtp" (indicating an SMTP host) or "mail" (indicating a
mailhost). Of course, other sequences and point values may also be used. The above
configuration data can be extended to include other sequences that may subsequently
be associated with dialup hosts.
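
The scoring of step 1422 might be sketched as follows, using the Table 2 entries in
their "string/points" form. The function names and the case-insensitive matching
are assumptions.

```python
# Table 2 entries, written as "text/points".
TABLE_2 = ["dial/5", "ppp/5", "slip/5", "dhcp/5", "smtp/-5", "mail/-5"]


def parse_entries(entries):
    """Split each "text/points" entry into a (text, points) pair."""
    rules = []
    for entry in entries:
        text, points = entry.rsplit("/", 1)
        rules.append((text.lower(), int(points)))
    return rules


def node_name(hostname):
    """The part of the host name up to the first period."""
    return hostname.split(".", 1)[0].lower()


def score_node_name(hostname, rules):
    """Sum the points of every configured sequence found in the node name."""
    name = node_name(hostname)
    return sum(points for text, points in rules if text in name)
```

For "dial37.remote.dom" this yields five points, which alone does not reach the
ten-point threshold; the remainder must come from the neighbor comparison of step
1423.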

Not all remote hosts have a consistent name and IP address assigned in DNS,
as described for step 1404. That is, a call to gethostbyaddr() or gethostbyname() will
fail, or the returned information is not consistent. The Dialup DB 1097 has a
Reject-unknown-dial option to either reject the connection with an error message
indicating a DNS naming error or continue processing, thus relying on other filter
layers to catch the problem.

At step 1423, the proxy 1401 compares the node name of the remote host 1400
with its neighbors and assigns additional points if the names appear to follow a
sequential naming scheme. Further to the preferred embodiment, the proxy compares
names of neighbor hosts by performing the following actions for all IP addresses that
are within the range nnn-10 to nnn+10, where nnn is the node address (last byte of IP
address) of the remote host 1400. Details of step 1423 are provided in Fig. 17. For
example, the following Table 3 for a remote host 63.11.217.117 shows the IP
addresses and node names for its 20 nearest neighbors.

Offset    IP Address       Node Name
6         63.11.217.123    1Cust123
7         63.11.217.124    1Cust124
8         63.11.217.125    1Cust125
9         63.11.217.126    1Cust126
10        63.11.217.127    1Cust127

Table 3 - Neighboring IP Addresses for
Remote Host 63.11.217.117



This example shows how this particular ISP sequentially named its hosts over
the range to be considered (and, indeed, throughout the entire block of addresses). In
this case the last byte of the IP address is identified directly in the node name, but this
is not necessary for this approach to work.

The proxy can consider either node names or complete host names in
evaluating whether the remote host exists within a sequential name space. In general,
it is more efficient to consider node names. However, an ISP can organize a dialup
name space so that the sequential naming scheme occurs within an intermediate node
of the name, such as the IP addresses 24.65.51.66 and 24.65.51.67 for the names
24.65.51.66.on.wave.home.com and 24.65.51.67.on.wave.home.com, respectively.

At step 1424, the proxy 1401 compares the total current number of match
points from steps 1422 and 1423 with the threshold number of points (10, in the
preferred embodiment) required to characterize the remote host as a dialup. If the
number of match points exceeds the threshold, then it exits, step 1425. Otherwise,
message transfer continues with step 1901.

At step 1425, the proxy 1401 issues an SMTP error message (e.g., "550
apparent dialup") and exits, thus closing the data connection without any mail being
transferred. In the preferred embodiment, the proxy also logs the rejected dialup and
adds the IP address of the remote host to the blacklist database.

Figure 17 shows further detail of the processing flow for step 1423 in
accordance with the preferred embodiment. Step 1500 calculates a 32-bit IP address
for the remote host, which is used in step 1504 to calculate the IP address of one of its
20 neighbors. Steps 1501, 1502, and 1503 perform the remaining steps shown in the
figure for x = -10 to x = +10, inclusive, while skipping the remote host at x = 0. When
the loop is finished, the proxy exits to step 1424 of Fig. 16, which classifies the
remote host as a dialup or non-dialup based on the accumulated number of match
points.

Step 1505 limits the name comparison to the 8-bit (Class C) address block that
contains the remote host to avoid comparing a remote host name in one ISP with
neighbors in a block operated by a different ISP. It XORs the 32-bit IP address for the
neighbor x and the IP address for the remote host and shifts the result right 8 bits. If
the result is non-zero, then the neighbor x is in a different address block than the
remote host, and is skipped. Thus, the range is absolutely bounded by a minimum
node address of 0 and a maximum node address of 255, so that the comparison for
remote host 192.168.200.2 would only consider node addresses from 0-1 and 3-12, in
order to avoid comparing names in other 8-bit blocks of addresses.
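
The neighbor loop of steps 1500-1505 might be sketched as follows. The generator
name neighbor_addresses and its span parameter are hypothetical.

```python
import socket
import struct


def neighbor_addresses(remote_ip, span=10):
    """Yield (offset, dotted_quad) for each neighbor of remote_ip that
    lies in the same 8-bit address block."""
    base = struct.unpack("!I", socket.inet_aton(remote_ip))[0]
    for x in range(-span, span + 1):
        if x == 0:
            continue                         # skip the remote host itself
        candidate = base + x
        if candidate < 0 or candidate > 0xFFFFFFFF:
            continue                         # outside the 32-bit address space
        if (candidate ^ base) >> 8 != 0:
            continue                         # different block: skip (step 1505)
        yield x, socket.inet_ntoa(struct.pack("!I", candidate))
```

For the remote host 192.168.200.2 the generator produces node addresses 0-1 and
3-12, in agreement with the example in the text.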

At a minimum, ten addresses will always be considered. Preferably, 10 names
are matched out of 20 (n-10 to n+10) so that if the remote host is at the beginning of a
block, e.g., 192.135.140.0, then there will still be ten opportunities to match from 1 to
10, and if the remote host is at the end of a block (e.g., 192.135.140.255), there will
still be ten opportunities to match in the range 245 to 254.

Steps 1506 and 1507 call gethostbyaddr() to get the host structure for the
neighbor x, which contains the hostname. Errors do not terminate the comparison,
since there may be gaps in the DNS information near the remote host. Steps 1509 and
1510 compare the respective lengths of the remote host name and its neighbor x. If
either is more than one character longer than the other, then skip the neighbor x
because the two names do not appear to be part of a sequence.

Step 1511 scans forward and backwards to identify the sequence of
non-matching characters in the names of the remote host and its neighbor x. This
sequence may contain substrings of matching characters, but as shown in step 1512, if
either string is greater than three characters in length, then skip the neighbor x because
the two names do not appear to be part of a sequence.

Step 1513 scans the two strings from the names of the remote host and the
neighbor x to determine if either contains a hexadecimal-only digit, i.e., a character in
the range a-f. If so, it sets the hexmode flag. In steps 1514-1516, the proxy 1401
checks the hexmode flag and converts the string for the host x to a hexadecimal or
decimal value, based upon the setting of the hexmode flag.

In steps 1517-1519, the proxy calculates the absolute distance between the two
name sequences. If the distance is less than or equal to the absolute value of x, then
the names appear to be part of a sequence and the match counter is incremented. For
example, Table 4 shows the distance as correlated to the offset x for the four nearest
neighbors of the remote host 63.11.217.117, based on the information in Table 3. As
shown in this example, the distance calculated for each of the four nearest neighbors
is identically equal to the difference in IP address values, thus the names are part of a
sequence.



IP Offset (x)    IP Address       Node Name    Distance
-2               63.11.217.115    1Cust115     2
-1               63.11.217.116    1Cust116     1
0                63.11.217.117    1Cust117     skipped
1                63.11.217.118    1Cust118     1
2                63.11.217.119    1Cust119     2

Table 4 - Distance for IP Addresses
Neighboring Remote Host 63.11.217.117
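
The name comparison of steps 1509-1519 might be sketched as follows. The function
names are hypothetical. Treating an empty differing string or a string that cannot
be converted to a number as "not part of a sequence" is an assumption, since the
text does not address those cases.

```python
def differing_parts(a, b):
    """Strip the common prefix and suffix of two names and return the
    non-matching middle of each (step 1511)."""
    limit = min(len(a), len(b))
    start = 0
    while start < limit and a[start] == b[start]:
        start += 1
    end = 0
    while end < limit - start and a[-1 - end] == b[-1 - end]:
        end += 1
    return a[start:len(a) - end], b[start:len(b) - end]


def names_in_sequence(remote_name, neighbor_name, x):
    """True if the two node names look sequential for IP offset x."""
    remote_name = remote_name.lower()
    neighbor_name = neighbor_name.lower()
    if abs(len(remote_name) - len(neighbor_name)) > 1:
        return False                                     # steps 1509-1510
    r, n = differing_parts(remote_name, neighbor_name)
    if not r or not n or len(r) > 3 or len(n) > 3:
        return False                                     # step 1512
    hexmode = any(c in "abcdef" for c in r + n)          # step 1513
    base = 16 if hexmode else 10
    try:
        distance = abs(int(r, base) - int(n, base))      # steps 1514-1518
    except ValueError:
        return False                                     # not numeric
    return distance <= abs(x)                            # step 1519
```

Applied to the rows of Table 4, the differing strings for offset -2 are "5" and "7",
giving a distance of 2. Because matching trailing characters are stripped as well as
leading ones, the distance is computed on the differing digits only.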



The preferred embodiment detailed in Figure 17 provides categorization of a
dialup host based upon a linear correlation of its neighbors' host names. It does not
require each host name to directly encode its IP address (as shown above), but also
permits other linear relationships. It correctly handles address blocks that have one or
more legitimate mailhosts, usually at the beginning of the block, but use the rest of the
block for dialup addresses. It permits discontinuities in the name space, provided the
remote host is part of a name sequence that is sufficiently long. It correctly handles
node addresses at the top (e.g., 255) and bottom (e.g., 0) of a Class C block of
addresses, where name discontinuities are most likely to occur. It also permits
fixed-width names (e.g., "001" through "255") and variable-width names (e.g., "1"
through "255"). And, as indicated in Figure 16, it permits either decimal or
hexadecimal encodings.

The preferred embodiment supplements these Active Dialup methods with the
blacklist filter 1406 (Figure 14). Bulk mailers who use the SMTP direct mechanism
will typically retry from different (dynamically assigned) IP addresses, but frequently
from addresses in the same Class C address range. By adding the IP address to
the blacklist database, nominally with the number of bits to be matched set to 24, the
mechanism takes advantage of the relative speed of a blacklist database lookup as
compared with subsequent iterations of the above active dialup mechanism. The filter
can be left perpetually in the blacklist database, or preferably removed from the
blacklist database if it is not used in some number of days or weeks. Further, filters
can be manually added to handle any blocks of dialup addresses that are not identified
by the Active Dialup method.

The preferred embodiment provides for Active Dialup detection following
reception of the MAIL From message on connection 1403. This permits logging of the
MAIL From address and also prevents continuous cycles of reverse test connections
since Active Dialup detection occurs after the reverse connection test in step 1418.

However, in an alternate embodiment a proxy might perform Active Dialup
detection immediately after the proxy receives a connection 1403 from the remote
host 1400. To prevent continuous cycles of reverse test connections, the proxy would
then preferably perform name categorization in steps 1422, 1423, and 1424 before
attempting to open a reverse test connection. Further, if the test connection is opened
successfully, the proxy must then immediately close the connection, thus causing a
remote Active Filtering proxy to exit before it can open a reverse test connection back
to the current proxy.

The preferred embodiment shown in Figure 17 depends upon administrators of
remote networks continuing to use logical sequences when they assign names to their
dialup addresses. Such sequences are easy to define and maintain, so it is in
the interest of these administrators to use a sequential naming scheme. However, ISPs
might also assign irregular names or name lengths to their node addresses. The only
conceivable reason for making such assignments would be to permit their users to
avoid Active Dialup detection, so such addresses can be manually added to the
blacklist database if they are not appended by subsequent Active User filtering.

The following alternative embodiments are more flexible with respect to
detecting such irregular naming sequences, but have other side effects as noted. These
embodiments could be used in step 1423 in place of, or in conjunction with, the
preferred embodiment shown in Figure 17.

In accordance with the alternative embodiment shown in Figure 22, the proxy
can categorize a remote host as dialup or non-dialup based on the edit distance
between the remote host's name and its neighbors' names. For example, a change of a
"3" to a "7" involves an edit distance of one, as does insertion of a character, or
deletion of a character. In conjunction with its failure to establish a reverse test
connection 1418, a proxy can conclude that a low edit distance is evidence that the
remote host name is part of a set of closely-related names consistent with a dialup
name space. This method could be used to replace the method shown in Figure 17
(and referenced in Figure 16 step 1423) where names are closely-related, but not
necessarily sequential. With respect to Fig. 22, steps 1500-1520 are as described in
Figure 17 and provide a method of acquiring the neighboring host names for the
remote host 1400. In step 1530, the proxy accesses the Dialup DB 1097 to acquire the
threshold value to be used in the remaining steps.

Possible threshold values are 1, 2, and 3, since an edit distance of 0 would
indicate identity and thus not be useful, and an edit distance greater than 3 is too broad
and would result in miscategorization of non-dialup hosts as dialups. An edit distance
of 1 would indicate a high degree of correlation (names would match only if they
differed by one character), but would fail to match names such as "dial39" and
"dial40". A threshold of 2 is probably optimal, even though it would improperly
categorize rollover situations such as "dial99" to "dial100". A threshold of 3 would
address the aforementioned rollover problem, but would miscategorize, for example, a
remote host "mail" surrounded by a sufficient number of hosts with names such as
"mail1", "mail2", and "menu".

In step 1531, the proxy scans each character of the
neighbor name and compares it with the corresponding character in the remote host
name. If the two characters are identical (step 1532), then the proxy advances the
character pointer in the two names. In steps 1533, 1534, and 1535, the proxy
determines if a character must be replaced, inserted, or deleted to make the neighbor
name consistent with the remote host name. If so, it increments the edit distance 1536
for the neighbor name and continues. When the comparison is complete, the proxy
then checks if the accumulated edit distance is less than or equal to the threshold read
from the Dialup DB in step 1530. If so, it increments the match count 1538. The
proxy then continues with the next name, as determined by step 1502.
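
The edit-distance comparison might be sketched as follows. The sketch uses the
standard dynamic-programming (Levenshtein) formulation rather than the
pointer-by-pointer scan of Figure 22; it counts the same three operations
(replacement, insertion, deletion). The function names and the default threshold of
2 are assumptions.

```python
def edit_distance(a, b):
    """Minimum number of single-character replacements, insertions, and
    deletions needed to turn string a into string b."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # replacement
        previous = current
    return previous[-1]


def count_close_neighbors(remote_name, neighbor_names, threshold=2):
    """Number of neighbor names within the threshold edit distance."""
    return sum(1 for name in neighbor_names
               if edit_distance(remote_name, name) <= threshold)
```

The examples in the text can be checked directly: "dial39" and "dial40" are at
distance 2, the rollover from "dial99" to "dial100" is at distance 3, and "mail" and
"menu" are at distance 3.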

In another alternative embodiment, the host may be categorized as either
dialup or non-dialup based on a correlation between names and IP addresses.
The correlation of a set of (x, y) values is a classic statistical problem. A low value
(0.0) indicates no correlation, while a value approaching 1.0 indicates a high
correlation. The x value in this case would be the node IP address (e.g., 107, 108,
109, etc.) and the y value would be some numeric representation of the host name
(e.g., 1Cust107, 1Cust108, 1Cust109, etc.). In this example, the correlation would be
exactly 1.0 because the names are assigned sequentially, but a lower correlation might
still be evidence of a dialup address.
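
As a rough sketch of this correlation test, the following computes a Pearson correlation coefficient between node addresses and numbers derived from host names. The choice of Pearson correlation and of the trailing digits as the "numeric representation" of a name are assumptions made for illustration; the text does not specify either.

```python
import math
import re


def trailing_number(host_label: str) -> int:
    """Assumed numeric representation: the digits at the end of the name (0 if none)."""
    match = re.search(r"(\d+)$", host_label)
    return int(match.group(1)) if match else 0


def pearson(xs: list[float], ys: list[float]) -> float:
    """Pearson correlation of paired values; 0.0 when either series is constant."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)
```

For the node addresses 107, 108, 109 and the names 1Cust107, 1Cust108, 1Cust109, the sketch yields a correlation of 1.0, as in the example above.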

Still another alternative embodiment involves categorizing the remote host
based on the ability to establish reverse test connections to its neighbors as well as the
remote host itself, step 1418 (Fig. 15). If a sufficient number of neighboring addresses
also do not permit reverse test connections, then it is reasonable to conclude that the
remote host is a dialup. This method might be used by itself to replace the method
shown in Figure 17, or may be combined with the method in Figure 17. With respect
to Figure 23, the steps 1500-1520 are as described in Figure 17 and provide a means
of stepping through each of the 20 nearest IP addresses for the remote host. In step
1550, the proxy attempts to connect to the neighbor x, using the same means
described in step 1418. It checks the status for the connection in step 1551. If the
connection is not successful 1552, as would be typical for a block of dialup addresses,
the proxy increments the match count. However, if the proxy is able to establish a
reverse connection to the neighbor x, it subtracts 2 from the match count 1553 and
closes the test connection 1554. This weighting permits as many as three neighbors to
accept reverse connections and still categorize the remote host as a dialup. The proxy
then continues with the next name, as determined by step 1502. The disadvantage of
this approach is that it may be time-consuming to attempt a large number of reverse
SMTP connections. However, it is less time consuming to perform this test in SMTP
filtering software than it is to deal with the spam or junk mail after it is received on
the organization's mail server.
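
The weighting in this neighbor test can be illustrated with a short sketch. The function names are hypothetical, and the probe simply attempts a TCP connection to port 25; the match-count threshold itself is read from a database in the described system and is not reproduced here.

```python
import socket
from typing import Callable, Iterable


def can_open_smtp(address: str, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to port 25 of `address` succeeds."""
    try:
        with socket.create_connection((address, 25), timeout=timeout):
            return True
    except OSError:
        return False


def neighbor_match_count(
    neighbors: Iterable[str],
    accepts_connection: Callable[[str], bool] = can_open_smtp,
) -> int:
    """Add 1 for each neighbor that refuses a reverse connection, subtract 2 otherwise."""
    count = 0
    for address in neighbors:
        if accepts_connection(address):
            count -= 2  # a reachable neighbor counts against the dialup hypothesis
        else:
            count += 1  # an unreachable neighbor looks like part of a dialup block
    return count
```

With 20 neighbors, three reachable hosts give a count of 17 - 6 = 11, while four give 16 - 8 = 8, so a threshold between those values would reproduce the "as many as three neighbors" behavior described above.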

As described above, various conventional systems maintain static databases
that list blocks of dialup addresses. However, the problem with these lists is that they
are static (not dynamic). Consequently, a new block of addresses can be abused by
dialup PCs numerous times before it goes on one of these lists. Further, if such a
block of addresses is blockaded, and subsequently reused for a legitimate mailserver,
then legitimate mail will be rejected because of the history of that IP address.

An advantage of the Active Dialup filter of the present invention is that it is
dynamic, and so categorizes the remote host at connection time based on
non-response to a reverse SMTP connection and certain characteristics of the names
of the remote host and its neighbors.

Active Relay Testing

In Figure 18, the proxy 1401 continues with Active Relay testing at step 1450
after successfully opening a reverse test connection to the remote host (step 1419,
Figure 15). Active Relay testing is performed only for non-trusted hosts (as
determined in step 1417) and if the reverse connection 1418 was successfully
established. The Active Relay test characterizes the remote host with respect to its
perceived acceptance of a reply to the supposed sender and whether the remote host is
likely to be an open relay.

The proxy 1401 performs Active Relay Testing by testing the validity of the
MAIL From address on the reverse connection and implementing a relay test, such as
the one shown in Figure 5. These tests are conducted while the remote host 1400 is
connected to the proxy 1401 as a factor in determining whether to accept the remote
host's message. If the remote host 1400 gives an indication that it will not accept a
reply email message to the purported sender or that it will relay a test message from
the proxy 1401, then that remote host is at an increased risk for relaying mail from
someone else to the local MTA 1402. On the other hand, if the remote host indicates
that it will accept a reply message for the sender and that it will not relay for the
proxy, then the remote host probably is not an open relay.

The proxy 1401 performs this test using the reverse SMTP connection 1418,
then continues with the protocol transactions in steps 1454, 1456, 1458 and 1465. The
test simply monitors the responses from the remote host, and does not actually send an
email message to the remote host 1400. The local MTA 1402 is not involved in this
test.

At steps 1453-1467, the proxy server 1401 performs the active relay test. Steps
1453 through 1458 are preliminary steps required to progress to the relay test, while
steps 1460 and 1467 provide the answers to the relay test. In step 1453, the proxy
server 1401 reads the remote host's greeting 1452 from the open connection 1418. As
indicated in item 1017 in Figure 2, when an MTA receives a connection to its SMTP
port, it writes its system greeting to the connection to indicate that it is ready to
receive mail. The proxy reads and discards each line of the greeting, handling
multi-line greetings as described for Figure 2, since the greeting does not contain any
useful information. If any read fails or if the first three characters of the greeting are
not "220", the proxy exits from the relay testing sequence and continues with Active
User testing of the supposed sender's mailhost in step 1901.
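
The greeting check can be sketched as below. The handling of multi-line greetings for Figure 2 is not reproduced in this section, so the sketch assumes the standard SMTP convention in which continuation lines carry a hyphen after the three-digit code and the final line carries a space. The function names are illustrative.

```python
from typing import Iterable, Optional


def read_reply_code(lines: Iterable[str]) -> Optional[str]:
    """Consume one SMTP reply and return its 3-digit code, or None on any error."""
    code = None
    for raw in lines:
        line = raw.rstrip("\r\n")
        if len(line) < 3 or not line[:3].isdigit():
            return None  # malformed reply line
        if code is None:
            code = line[:3]
        elif line[:3] != code:
            return None  # the code changed in the middle of a reply
        if len(line) == 3 or line[3] == " ":
            return code  # final line of the reply
        if line[3] != "-":
            return None  # neither a continuation marker nor a final line
    return None  # input ended before the final line was seen


def greeting_ok(lines: Iterable[str]) -> bool:
    """True only if the complete greeting was read and its code is 220."""
    return read_reply_code(lines) == "220"
```

The text of each line is discarded, matching the observation above that the greeting carries no information the relay test needs.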

In step 1454, if the greeting is received without errors, the proxy sends a
HELO message to the remote host. The text of the message is a configurable string
(read from the configuration database 1098, Fig. 7), defined when the proxy is
installed, and typically identifies the local host name. It should be noted that the
HELO exchange is optional in most cases. However, some hosts require a HELO,
which is therefore preferably included in the relay test.

In step 1455, the remote host 1400 sends its reply to the HELO message. In
step 1456, the proxy 1401 reads the HELO response sent in step 1455. If the read fails
or if the first three characters of the response are not "250", the proxy exits from the
relay testing sequence and continues with Active User testing of the sender's mailhost,
step 1901.

If the HELO reply is received without errors at step 1456, then the proxy
issues a MAIL From message to the remote host. As noted above, the preferred
embodiment uses the reserved name "reverse" to notify another Active Filtering proxy
that this is a reverse test connection so that the remote proxy can avoid a connection
loop. The address should identify the local domain name (for legal reasons). Thus, the
message appears as MAIL From: <reverse@local.dom>.

The remote host 1400 will then respond as shown in step 1457, typically with
a "250" response (to indicate acceptance) or a "550" response (to indicate
non-acceptance). As shown in step 1458, if the remote host 1400 replies with
anything other than a "250" response, or does not respond, the proxy exits from the
relay testing sequence and continues with Active User testing of the sender's mailhost,
step 1901.

If the remote host 1400 accepts the proxy's MAIL From message on the
reverse connection 1418, then the proxy 1401 issues a RCPT message 1458 to the
remote host on the reverse test connection. The proxy gives the complete MAIL From
address of the supposed sender of the message ("mfaddr"), as received previously in
steps 1413 and 1414. This is equivalent to doing a real-time reply to the sender of the
message, except that no data is transferred.

In steps 1458 and 1460 the proxy determines if the remote host will accept the
address of the supposed sender of the message. This may appear to be designed to
determine the actual existence of the user "mfaddr", as is performed by the Active User
test at step 1901, and there is some overlap if "mfaddr" actually exists at the remote
host 1400. However, in the general case, this step is designed to determine if the
remote host is configured to deliver (either locally or by relaying) email to the
supposed sender. If at step 1460 the reply is "250", then the remote host will accept
mail to this address, so the proxy continues with step 1462. Otherwise, if the reply is
anything other than "250", the proxy sends an error response 1461 on the data
connection 1403 and exits, thus closing the data connection 1403 and the reverse
connection 1418. In the preferred embodiment, the proxy also writes a system log
entry for the rejected message and adds the remote host's IP address to the blacklist
database 1095 (Figure 7) before exiting.
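
The first half of the reverse-connection dialogue (HELO, MAIL From, and the RCPT naming the supposed sender) can be sketched as a small decision function. The `exchange` callable, the outcome labels, and the function name are hypothetical; `exchange` is assumed to send one command and return the three-digit reply code, or None if the read failed. The commands are written in the RFC 5321 form, which has no space after the colon.

```python
from typing import Callable, Optional

# Hypothetical outcome labels keyed to the step numbers in the text.
CONTINUE_RELAY_TEST = "continue_at_1462"
ACTIVE_USER_TEST = "active_user_test_1901"
REJECT = "reject_1461"


def sender_acceptance_test(
    exchange: Callable[[str], Optional[str]],
    helo_name: str,
    local_domain: str,
    mfaddr: str,
) -> str:
    """Run HELO, MAIL From, and RCPT on an already-greeted reverse connection."""
    # A failed HELO or MAIL From is inconclusive: fall back to Active User testing.
    if exchange(f"HELO {helo_name}") != "250":
        return ACTIVE_USER_TEST
    if exchange(f"MAIL FROM:<reverse@{local_domain}>") != "250":
        return ACTIVE_USER_TEST
    # The RCPT names the supposed sender; no DATA command is ever issued.
    if exchange(f"RCPT TO:<{mfaddr}>") == "250":
        return CONTINUE_RELAY_TEST
    return REJECT
```

Only the final RCPT leads to outright rejection; earlier failures are treated as errors in the test itself, consistent with the steps described above.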

For example, smallhost.dom may be a customer of bigisp.dom and uses the
"smart host" smtp.bigisp.dom to handle its mail forwarding and to receive mail when
it is not online. Thus, if smtp.bigisp.dom connects with MAIL From:
<someone@smallhost.dom> and accepts a reverse test connection, then
smtp.bigisp.dom should accept the proxy's RCPT To: <someone@smallhost.dom>
message on the reverse test connection. However, if smtp.bigisp.dom connects with
MAIL From: <getrichquick@bulkmail.dom> and accepts a reverse test connection,
but will not accept a reply to the MAIL From address, this is evidence that the
message is forged. It should be noted that this is only for the situation where the
remote host will accept a reverse test connection. If the network will not accept a
reverse test connection because it has separate incoming and outgoing mail servers,
then processing will follow the Active Dialup 1420 path as shown in Figure 15.

At step 1462, the proxy 1401 attempts to find if the IP address of the remote
host 1400 matches a non-relay entry in the Relay database 1096 (Fig. 7). This
database lists blocks of addresses that the local organization must exchange email
with, but which would fail the relay test. There might typically be between about 5-50
entries in this database, with each entry covering a block of addresses. These entries
can be pre-defined by a site survey performed by each organization, preferably before
installing the Active Filtering proxy server. For simplicity, the preferred embodiment
of the Relay Database 1096 (as with other IP addresses listed in steps 1406 and 1413)
expresses these addresses as a dotted-quad IP address, a forward slash "/", and a
number of bits to be matched. Other embodiments may use other representations
(hashed lists, dbm files, or CAM) for performance reasons.
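
Matching an address against entries written as a dotted-quad address, a slash, and a bit count can be sketched with Python's standard `ipaddress` module. The function name and the sample entries (taken from documentation address ranges) are illustrative and do not come from the source.

```python
import ipaddress


def matches_any(ip: str, entries: list[str]) -> bool:
    """True if `ip` falls inside any 'a.b.c.d/bits' block in `entries`."""
    address = ipaddress.ip_address(ip)
    # strict=False tolerates entries whose host bits are not zero.
    return any(address in ipaddress.ip_network(entry, strict=False) for entry in entries)
```

A linear scan like this is adequate for the few dozen entries the text anticipates; the hashed or dbm representations mentioned above would matter only for much larger lists.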

If the IP address matches any non-relay entry, then the proxy 1401 bypasses
any further relay and user testing, and proceeds with message transfer in step 1470. If
the IP address of the remote host does not match any entry in the non-relay list, then it
continues with the second part of Active Relay testing at step 1463. Before
performing the relay test, the proxy compares the MAIL From domain with the
connecting host name. A match at step 1463 occurs when the connecting host name
and the MAIL From domain are identical, beginning at the end of the two strings, and
comparing backwards over the last two nodes (i.e., periods) of the two domains. For
example, if host "smtp.gamma.dom" connects with MAIL From
"alpha@gamma.dom", the two domains match over the scope "gamma.dom".
However, if host "smtp.gamma.dom" connects with MAIL From "alpha@beta.dom",
the two domains do not match.
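
The domain comparison at step 1463 can be sketched as a comparison of the last two dot-separated labels. The function name and the `labels` parameter are illustrative assumptions; the comparison is made case-insensitive because DNS names are.

```python
def domains_match(connecting_host: str, mail_from: str, labels: int = 2) -> bool:
    """Compare the last `labels` dot-separated nodes of the host and sender domain."""
    sender_domain = mail_from.strip().strip("<>").rsplit("@", 1)[-1]
    host_tail = connecting_host.lower().rstrip(".").split(".")[-labels:]
    sender_tail = sender_domain.lower().rstrip(".").split(".")[-labels:]
    # Both names must supply the full number of labels before they can match.
    return len(host_tail) == labels and host_tail == sender_tail
```

For the examples above, "smtp.gamma.dom" matches "alpha@gamma.dom" over "gamma.dom" but does not match "alpha@beta.dom".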

If the two domains match in step 1463, the proxy checks the Relay database
1096 at step 1464 to determine if an administrator has configured the proxy to
perform loose relay testing. With loose testing, the proxy permits mail from open
relays if the MAIL From address matches the connecting host name. In this case, the
relay test message 1465 is not necessary, so the proxy continues with transfer of the
message in step 1470.

If either the domains do not match in step 1463 or the Relay database is
configured for strict relay testing, the proxy issues a RCPT message 1465 identifying
a configurable string, defined when the proxy is installed. The default includes the
name "relayto" and the local domain name (for legal reasons), for example, RCPT To:
<relayto@local.dom>. The configurable recipient address may be any syntactically
correct address. Even though a message is not sent, the RCPT address is preferably
not a real user address in order to avoid address mining by spam site administrators.

The remote host 1400 will then respond as shown in step 1466, typically with
a "250" response (to indicate that it is willing to relay) or "550" (to indicate that it will
not relay). In step 1467, the proxy 1401 determines the status of the reply message
from the remote host.

If the reply is "250", then the remote host will apparently relay for the proxy,
so the proxy rejects the message as indicated in step 1468 and exits, thus closing all
connections. In the preferred embodiment, the proxy also writes a system log entry for
the rejected message and appends the IP address to the blacklist database (Figure 7,
item 1095). If the reply to the RCPT message 1465 is anything other than "250", then
this indicates that the remote host is not an open relay, and so the proxy continues
with reception of the message at step 1470.

Thus, the proxy sends two RCPT messages (1458 and 1465) to the remote
host. The remote host must respond with a "250" to message 1458 and with anything
other than "250" to message 1465. This establishes that the remote host is responsible
for the MAIL From address received in step 1414 but is not an open relay that will
accept any RCPT address. Other combinations are described in the following
paragraphs.

If the remote host responds with a "250" to both RCPT messages, then it is not
possible to tell if the MAIL From address 1413 is legitimate or not. In the previous
example, a legitimate message from <someone@smallhost.dom> may be sent via an
open relay smarthost smtp.bigisp.dom. Alternately, the message could well be forged,
since the remote host is an open relay. For example, referring to step 1064 of Figure 4,
the spammer forged the MAIL From address "goodi@relay.dom". Thus, when the
proxy receives "250" responses to both messages, the preferred embodiment of the
Active Relay method is to reject the message and log its rejection. If a subsequent
review of rejected messages shows a legitimate address, the administrator of an
Active Filtering proxy can then add the individual address to the whitelist database
(Figure 7, item 1094) or can bypass relay testing for smtp.bigisp.dom by defining it as
a non-relay in the Relay database (Fig. 7).

The remote host may also respond with a "550" (for example) to message 1458
and a "250" to message 1465. Some hosts will permit promiscuous relaying but reject
any non-existent local addresses. In this case the proxy rejects the email message.

The remote host may respond with a "550" (for example) to both RCPT
messages. In this case the remote host itself is probably not an open relay. The bad
address is probably either forged by a user at the remote host or there exists a
multi-hop relay path through some other host that is trusted by the remote host 1400.
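
The four combinations of replies to the two RCPT messages can be summarized in a small decision function. This is an illustrative sketch only: the function name and the wording of the reasons are not from the source, and the fourth case is rejected because the flow described earlier rejects any non-"250" reply to message 1458.

```python
def relay_verdict(sender_rcpt_code: str, relay_rcpt_code: str) -> tuple[str, str]:
    """Map the replies to RCPT 1458 (supposed sender) and RCPT 1465 (relay probe)."""
    sender_ok = sender_rcpt_code == "250"
    will_relay = relay_rcpt_code == "250"
    if sender_ok and not will_relay:
        return ("accept", "host answers for the sender and is not an open relay")
    if sender_ok and will_relay:
        return ("reject", "open relay; sender address cannot be verified")
    if will_relay:
        return ("reject", "open relay that also refuses the sender address")
    return ("reject", "not an open relay, but the sender address is refused")
```

Only one combination leads to acceptance; administrators override the others through the whitelist, trusted, or non-relay entries described in this section.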

The administrator may manually edit the relay database to add an IP address if
a review of log entries shows that the remote host is an authorized "smart host", i.e., a
host authorized to handle the local network's outgoing email. In addition, certain
MTA programs give a "250" reply to the RCPT message 1465, but then discard the
message later on. These may be configured as trusted or as non-relay.

The active relay method permits automatic rejection of all email sent from a
user at an open relay host or relayed by an open relay host. However, in some cases it
is necessary to override this behavior, for business or other reasons. For example, with
respect to the bigisp example given above, the administrator of the Active Filtering
proxy can configure the proxy to permit email sent from smallhost.dom and relayed
by bigisp.dom by any one of the following actions: (1) defining bigisp.dom as a
trusted domain, (2) adding a whitelist entry for the specific address
"user@smallhost.dom", or (3) adding a non-relay entry for bigisp.dom, even though it
is an open relay.

A pre-installation site survey can anticipate such problems by reviewing
system logs and testing all hosts that have recently connected to the local organization
for relaying. Any open relays can then be configured in the respective databases
before the Active Filter proxy is installed. No special actions are required for hosts
that do not relay or for hosts that do not routinely send mail to the local organization.

With reference again to Figure 18, if the Active Relay method determines the
remote host 1400 to be an open relay, it sends an error message 1468 to the remote
host, appends the IP address of the open relay to the blacklist file, writes an entry to
the system log, and exits without transferring the message to the local MTA.
Otherwise, if the remote host is not an open relay, the proxy continues with message
transfer at step 1470. It is noted that relay testing is more accurate than user testing,
so that user testing may be skipped.

The preferred embodiment of the Active Relay method performs testing in the
order shown in Figure 18. In particular, this ordering avoids the relay test sequence if
the remote host is configured as a non-relay (step 1462) or the domains match and
loose relay testing is configured (steps 1463 and 1464). However, alternative
embodiments may provide for other orders of testing.

The Active Relay method makes decisions based upon responses to the
proxy's RCPT requests. However, certain MTA products issue a "250" to these
RCPT requests but defer enforcement until they receive a DATA message or the closing
line indicating the end of the message. The proxy can be adapted to recognize these
products by looking for certain characteristic data (e.g., the product name) in the
greeting message 1452. In this event, the proxy can monitor subsequent responses for
these products, such as a response other than "354" to a DATA message.
Alternatively, the proxy can be extended to accept certain remote hosts based on
product type.

Active User Testing

The Active User method illustrated in Figure 19 determines if the MAIL From
address 1413 is acceptable to a mailhost 1900 configured to receive email for the
MAIL From domain. By convention, this mailhost is either a Mail Exchange (MX)
host or the host identified in the MAIL From message. This method uses the same
SMTP messages described for the Active Relay method (Figure 18, steps 1454-1458),
but in most cases the proxy accesses a different mailhost than the remote host 1400.
While the Active Relay test is concerned with determining if the remote host 1400 is
at risk for sending relayed or forged email, the Active User test accesses the mailhost
1900 responsible for receiving email for the MAIL From domain to determine if it
will accept email for that address. If it does not accept the MAIL From address, then
this indicates that the MAIL From address is probably forged and does not exist on
that network.

For example, assume the remote domain remote.dom has two mail servers,
out.remote.dom for sending outgoing email and mx.remote.dom for receiving
incoming email. If the proxy 1401 receives a connection 1403 from out.remote.dom,
the proxy will be unable to establish a reverse test connection 1418 to that particular
host because it is not configured to accept incoming SMTP connections. Assuming
that the host names surrounding out.remote.dom do not appear to be dialups, it
remains for the Active User method to attempt to validate the MAIL From address. In
this case the proxy 1401 would find the MX host mx.remote.dom and query that host
as to the validity of the MAIL From address.

This Active User method is not completely accurate by itself but does provide
an additional level of testing when the proxy cannot query the remote host 1400 via
the reverse test connection. Consequently, this method is used following Active
Dialup testing (where there is no reverse test connection 1418) and after encountering
errors in Active Relay testing.

The reason this method is not highly reliable by itself is that some large
networks (AOL.COM, YAHOO.COM, MSN.COM) accept (that is, give a "250"
response to) all RCPT messages identifying properly-formatted addresses on their
respective networks. This is done both for performance reasons and to prevent
outsiders from verifying or collecting addresses on that network by sending many
different possible addresses and monitoring the responses. Still, enough networks do
truthfully respond to RCPT messages to make the Active User method useful as a
supplement to the Active Dialup and Active Relay methods.

With respect to Figure 19, the Active Filtering proxy 1401 begins operation at
step 1901 subsequent to either Active Dialup or Active Relay testing. In step 1901,
the proxy identifies a mailhost 1900 responsible for receiving mail to the MAIL From
address 1413. The proxy searches the Domain Name System (DNS) information for
the MAIL From domain for records identifying Mail Exchange (MX) hosts for that
domain. MX records include a host name and priority value, and by convention the
lowest priority value identifies the MX host that should be tried first. In the preferred
embodiment, the proxy uses resolver library routines such as the DNS BIND res_init()
and res_query() functions to access the MX records; however, other methods may be
used to access the name server. If no MX host is found, then the proxy uses the MAIL
From address (that is, the host name to the right of the "@" character in the MAIL
From address) as the mailhost.
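
The selection rule for the mailhost can be sketched as follows. The DNS query itself is omitted because the Python standard library has no MX lookup; the sketch assumes the MX records have already been retrieved as (priority, host name) pairs, and the function name is illustrative.

```python
def choose_mailhost(mail_from: str, mx_records: list[tuple[int, str]]) -> str:
    """Pick the MX host with the lowest priority value, else the sender's domain."""
    if mx_records:
        # By convention the lowest priority value is tried first.
        return min(mx_records, key=lambda record: record[0])[1]
    # No MX host: fall back to the host name to the right of the "@".
    return mail_from.strip().strip("<>").rsplit("@", 1)[-1]
```

For example, given records (20, "mx2.remote.dom") and (10, "mx.remote.dom"), the sketch selects mx.remote.dom; with no records it falls back to the domain part of the address.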

In step 1903, the proxy attempts to connect to the mail server identified in step
1902. This follows the same mechanisms described in step 1418, except that the TCP
connection is to the identified mailhost 1900 rather than to the remote host 1400. If
the connection is successful in step 1904, the proxy waits at step 1906 for the system
greeting 1905 from the mailhost. Otherwise, if the connection is unsuccessful, the
preferred embodiment of the proxy simply proceeds to step 1470 for message transfer
to the MTA. In alternative embodiments, the proxy might successively check for
lower-priority MX hosts if the highest priority host is not available.

It is noted that for some networks where the same host handles both incoming
and outgoing email, the mailhost 1900 may be the same (that is, have the same IP
address) as the remote host 1400. In this case, the proxy simply makes a second test
connection to the same host without regard to having previously tested the MAIL
From address at this host.

Steps 1905-1912 follow steps 1452-1459 in the Active Relay test in Figure 18,
respectively. After receiving the system greeting, the proxy issues a HELO message in
step 1907. Subsequently, after receiving the HELO reply, it sends a MAIL From
message in step 1909 with the reserved Active Filtering test address. And finally, it
sends a RCPT message in step 1911 with the MAIL From address received in step
1414 (Figure 15).

In step 1913 the proxy inspects the mailhost's reply to the RCPT message. If
the reply is "250", the proxy continues with message reception at step 1470. If the
domain mailhost 1900 gives any reply other than "250", then in step 1914 the proxy
1401 sends an error reply to the remote host 1400 on connection 1403, closes data
connection 1403 and test connection 1903, logs the rejection, adds the IP address of
the remote host 1400 to the blacklist database, and exits.

Preferably, when appending an IP address to the blacklist database 1095, the
proxy adds a 4-byte IP address, e.g., 192.168.200.45, along with some number of bits
to be matched. Typically the number of bits is 24, so that any subsequent connection
by any host in the range 192.168.200.0 through 192.168.200.255 will be rejected by
the blacklist mechanism. This takes into account that Class C addresses are normally
assigned to organizations in multiples of 256, so subsequent connections in the
192.168.200.x range are normally owned by the same irresponsible organization, so it
makes sense to block all of them. However, if the ownership is subsequently
determined to be something more or less than a single class C, then the blacklist file
can be manually edited to block one or more hosts.
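
The effect of storing an address with 24 bits to be matched can be checked with Python's standard `ipaddress` module. The function name and the textual entry format are assumptions for illustration; the on-disk layout of the blacklist database is not specified in this section.

```python
import ipaddress


def blacklist_entry(ip: str, bits: int = 24) -> ipaddress.IPv4Network:
    """Build the block covered by `ip` when only the first `bits` bits are matched."""
    # strict=False masks off the host bits instead of raising an error.
    return ipaddress.ip_network(f"{ip}/{bits}", strict=False)


entry = blacklist_entry("192.168.200.45")
print(entry)                                          # 192.168.200.0/24
print(entry[0], entry[-1])                            # 192.168.200.0 192.168.200.255
print(ipaddress.ip_address("192.168.200.7") in entry)  # True
```

Changing `bits` to 32 blocks only the single host, which corresponds to the manual edits the text mentions when ownership turns out to be narrower than a full class C.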

Other than the trusted database, the preferred embodiment does not require any
special databases to control the Active User test mechanism. In an alternative
embodiment, a database prevents unnecessarily testing networks such as AOL.COM
that automatically respond with a "250" to RCPT addresses.

Connect to MTA

All message rejections to this point have not involved any storage (other than
log entries) being allocated on either the proxy or the MTA host computer. If all tests
have been successful, then Figure 20 shows how the proxy 1401 connects to the MTA
and transfers the initial SMTP messages required to set up the transfer of the message
to the MTA.

In step 1470 the proxy connects to the MTA using the same method described
for the reverse test connection 1418. In summation, the proxy connects to the MTA
for messages with any of the following characteristics: connection from a trusted
domain (step 1417); whitelisted MAIL From address (step 1417); or email from a user
with an account at a non-dialup, non-relay host (step 1467). In addition, subject to the
validity of the user address as determined in step 1913, the proxy will permit mail
from hosts where the reverse connection fails, but host is configured as non-dialup
(step 1421); reverse connection fails, but host is not detected as a dialup (step 1424);
reverse connection succeeds, but relay test is inconclusive (steps 1454, 1456, 1458);
reverse connection succeeds, but host is configured as non-relay (step 1462); or reverse
connection succeeds, MAIL From address matches connecting host, and the proxy is
configured for loose relay testing (steps 1463, 1464).

At steps 1470-1475 of Fig. 20, the proxy 1401 connects to the local MTA
1402, sends the accumulated information received so far from the remote host, and
starts to transfer data between the remote host 1400 and the MTA 1402. The proxy opens
the TCP connection 1470 and awaits a greeting 1471 from the MTA. When it
receives the greeting, it sends the HELO message 1472 to the MTA that it received
from the remote host 1400 in step 1430. If no HELO message was received from the
remote host, then the proxy does not send the message 1472. The proxy then awaits
the response 1473 from the MTA, and sends the MAIL From message 1474 that it
received from the remote host 1400 in step 1433. The proxy 1401 then awaits the
MTA's response 1475 from the MAIL From message, and writes that response
immediately to the remote host 1400.

In steps 1480 and 1481, the proxy simply receives a RCPT message from the
remote host 1400 and passes it transparently to the MTA 1402. In steps 1482 and
1483, it receives the SMTP reply from the MTA and passes it transparently to the
remote host. The remote host may send multiple of these RCPT messages, each of
which is handled in the same way.

In an alternative preferred embodiment, the proxy keeps track of the number of
recipients (both those accepted by the MTA and those rejected by the MTA) and
issues an error message when the remote host exceeds the maximum number of
recipients configured in the configuration database 1098.

Message Transfer and Bcc Filtering

At this point, the proxy 1401 is operating in a transparent pass-through mode.
Prior to step 1475, the proxy operates at the application level, where it handles the
SMTP messages on behalf of the local MTA. Beginning with step 1475, the proxy
operates in a filtering mode where it simply transfers data between the remote host
1400 and the MTA 1402, with limited filtering.

Figure 21 shows the processing steps involved in transferring the actual email
message from the remote host 1400 to the MTA 1402. Except for the limited filtering
performed in steps 1491 and 1492, the proxy transparently transfers the SMTP DATA
command 1484, message header lines 1488, message body lines 1490, 1493, and
closing protocol lines 1495 from the remote host 1400 to the MTA 1402. By
convention, the message header is defined as all lines of the message down to, but not
including, the first empty line. The proxy also transparently transfers SMTP replies
from the MTA to the remote host (steps 1480 and 1491). As is consistent with
SMTP, the lines of the message header and message body are not individually
acknowledged by the MTA.

The proxy may integrate any conventional filtering method, such as one that
prevents Blind Carbon Copy (Bcc) messages. Such mail is characterized with a local
domain address in one of the RCPT To: messages 1480, but without a reference to the
local domain in the header lines 1488 of the message. This can be legitimate
and intentional (as when a sender defines a Bcc address in their mail reader), but is
generally abused by junk mailers in order to save the processing time required to
reprocess the message header for each recipient.

In step 1491, the proxy 1401 checks To, Cc, and associated continuation lines
for a local domain address. The filter permits Bcc messages only for trusted
addresses. If it has not found a local To or Cc address when it reaches the last line of
the message header, it writes an error message to the remote host, closes the data
connection 1403, logs the event, appends the IP address of the remote host to the
blacklist 1406, and exits, step 1492. This appears to the MTA 1402 as an interrupted
TCP connection. The MTA typically logs this event to the system log file, but does not
send the partial message to any local users.
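
The header scan for a local To or Cc address can be sketched as follows. The function name is hypothetical and the matching is deliberately crude: it looks for "@" followed by the local domain anywhere in a To or Cc field, including continuation lines, which begin with a space or tab. A production filter would parse the addresses properly.

```python
def has_local_recipient(header_lines: list[str], local_domain: str) -> bool:
    """True if a To or Cc field (with its continuation lines) names the local domain."""
    needle = "@" + local_domain.lower()
    in_recipient_field = False
    for raw in header_lines:
        line = raw.rstrip("\r\n")
        if line == "":
            break  # the first empty line ends the message header
        if line[0] not in " \t":
            # A new header field starts; continuation lines keep the previous state.
            field_name = line.split(":", 1)[0].strip().lower()
            in_recipient_field = field_name in ("to", "cc")
        if in_recipient_field and needle in line.lower():
            return True
    return False
```

A message for which this returns False would be treated as a Bcc message and, unless the sender is trusted, rejected at step 1492.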

In steps 1495-1498, the proxy 1401 closes the connection 1403 from the
remote host and the connection 1470 to the local MTA when it senses that either
connection has been terminated. Typically, the remote host closes the connection by
sending a "QUIT" message 1495 to the MTA 1402. The MTA then writes a status
response 1497 and closes the connection 1470. The proxy 1401 senses the closed
connection and closes the connection 1403 from the remote host 1400.

This ends the transfer of the single SMTP email message associated with the
current instance of the proxy 1401, and the proxy 1401 exits. In the preferred
embodiments, multiple messages from multiple remote hosts are handled by relying on
the proxy server's operating system 1090 (Fig. 7) to run multiple instances of the
proxy process, one for each message. However, other implementations are consistent
with this invention such as, for example, a multi-threaded proxy server process that
handles multiple messages.

The preferred embodiment has been described as having IP filter testing 1406,
followed by Active Dialup testing 1420, Active Relay testing 1450 and Active User
testing 1900. However, it should be appreciated that these tests may be implemented
in various other orders. In addition, each of these tests has individual uses, and need
not be used in connection with one or more of the other tests.

The foregoing descriptions and drawings should be considered as illustrative
only of the principles of the invention. The invention may be configured in a variety
of different manners and is not limited by the preferred embodiment. Numerous
applications of the present invention will readily occur to those skilled in the art. For
example, though the preferred embodiment of the invention is implemented on a
global network, such as the Internet, it may also be used, for instance, in an isolated
intranetwork or using closed networking protocols such as Novell Netware.

In addition, in the event that it becomes illegal to connect to any mail server
except to transfer legitimate email, a message may be transferred to the MAIL From
address saying "We have received your SMTP connection and are considering
whether to accept your email message." The preferred embodiment tests for relaying
only for non-trusted hosts who attempt to deliver mail to the local MTA.

While the preferred embodiment uses Internet standard protocols such as IPv4,
DNS, TCP, and SMTP, the invention may also be used with other networking
protocols and network architectures, such as, for instance, IP version 6 (IPv6) or
X.500 name services, or protocols not yet developed. Further, the invention may be
used with other backbone MTA-to-MTA protocols such as Extended SMTP
(ESMTP), the X.400 Message Handling System (MHS) or client-to-mailhost
protocols such as POP or IMAP when the Active Filtering functions are not
performed on the backbone. Further yet, the invention may be used with various
cryptographic protocols, such as the IP Security (IPSec), S/MIME or OpenPGP
standards, although spammers are unlikely to use any protocols involving
traceable encryption keys.

The Active Filtering methods described in this application can be integrated with
other suitable devices and/or methods to provide additional capabilities. The proxy
can also be configured with additional databases not described in this application to
provide further controls or increased performance. For example, the preferred
embodiment of the proxy does not provide for appending hosts determined to be
non-relay or non-dialup to these respective databases. However, subject to
performance requirements, other embodiments of this proxy might perform caching of
tested addresses so as to avoid unnecessary re-testing.
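
One way such caching might look is sketched below. The class name
HostTestCache, the time-to-live policy, and the injected clock are all
assumptions made for illustration; the application does not specify how a cache
would be organized or when entries would expire.

```python
import time

class HostTestCache:
    """Hypothetical cache that remembers a test result for a host for a
    limited time, so the host is not re-tested on every connection."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._entries = {}  # IP address -> (result, expiry time)

    def get(self, ip):
        """Return the cached result, or None if absent or expired."""
        entry = self._entries.get(ip)
        if entry is None:
            return None
        result, expires = entry
        if self.clock() >= expires:
            del self._entries[ip]  # stale entry: force a re-test
            return None
        return result

    def put(self, ip, result):
        """Record a result such as "non-relay" or "non-dialup"."""
        self._entries[ip] = (result, self.clock() + self.ttl)
```

An expiry time is used here because a host's status can change; a site could
equally choose a permanent database, at the cost of stale entries.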

Per-Recipient Whitelisting and Quarantining

Two additional features of the preferred embodiment of the invention, per-recipient
whitelisting and quarantining, can be used by individual users of the network to
manage their own incoming email and to retrieve messages that were rejected by
Active Filtering. This embodiment uses the same Active Filtering (that is, Active
Dialup 1420, Active Relay 1450, and Active User 1900) tests as described with
reference to Figures 7-23.

Unlike the system described in Figures 7-23, which enforces access decisions during
processing of the MAIL From message, these two additional features require that the
proxy defer enforcement of Active Filtering decisions to RCPT time. This is because
the proxy does not know the recipients of an email message at MAIL From time. It is
only when the sending MTA 1400 identifies each intended recipient (that is, by
sending an RCPT message) that the proxy 1401 can access that recipient's whitelist
and determine how to process the message for that recipient. For example, one
recipient may have a whitelist entry to receive mail from a particular sender, while
another recipient may choose to discard mail from that sender. It should be apparent,
however, that per-recipient whitelisting and quarantining can be performed in any
suitable manner and at any suitable stage of mail processing subsequent to RCPT.

Figure 24 illustrates the additional databases used by this preferred embodiment. With
respect to Figure 7, Figure 24 adds an optional Per-recipient Whitelist Database 1600
for each recipient and a Quarantine Database 1610 for storage of rejected messages.
These databases are accessed by RCPT message processing, so a block for RCPT
message processing was also added in Figure 24.

The per-recipient filtering can be flexibly configured by the recipient. First, this
filtering allows the recipient to have a whitelist containing a single "@" that will
match any sender's address and thus totally override Active Filtering for that recipient.
Second, the recipient can have a whitelist that enumerates certain sending
domains/addresses that will override Active Filtering. In this case, mail from one of
these senders that is detected as a spam risk by Active Filtering will be permitted for
that recipient and blocked for all other recipients not having a matching whitelist.
Third, other recipients can operate without a per-recipient whitelist, in which case the
Active Filtering decisions will block mail in accordance with Figures 7-23.

The Per-RCPT Whitelist Database 1600 consists of a collection of whitelist files,
where each file pertains to a particular recipient. The whitelist files 1600 can be
configured by a system administrator, or with suitable access controls, by the
recipients themselves, so that one recipient can receive an Active Filtered message
from a particular sender while the message is blocked for another recipient. The
per-recipient whitelist files 1600 are provided in addition to the system whitelist 1094.
Accordingly, a particular recipient will receive a message if the sender is listed in
either the system whitelist 1094 or the recipient's whitelist 1600.

In addition, the filter can quarantine messages that would otherwise be rejected. A
quarantined message is received by the proxy and saved to disk storage, where it can
subsequently be reviewed and forwarded by either the recipient or an email
administrator. The proxy can be configured to separately enable quarantining for
Active Relay, Active Dialup, and/or Active User rejections. For example, a proxy can
be configured to quarantine messages that fail Active Relay testing and reject
messages that fail Active Dialup and/or Active User tests.

Per-recipient whitelisting and quarantining are performed separately for each
recipient, that is, for each RCPT address. Thus, in general, an email message with
multiple recipients may be whitelisted for some recipients and quarantined (or
rejected outright) for the remaining recipients. Of course, the recipient whitelist need
not be defined and/or the quarantining need not be configured by the system
administrator. This would effectively remove these features from the filter proxy.

The present preferred embodiment (shown in Figures 7-23) provides a uniform level
of junk mail protection for all recipients. That is, the Active Filtering Proxy 1401
determines if the message is unacceptable (i.e., the remote host is an open relay, the
remote host is a dialup, or the message has a forged MAIL From address), then either
accepts or rejects the message for all recipients. This is best suited for large
organizations such as Government agencies or commercial businesses that have a
consistent policy regarding junk mail and that have decided to block all potentially
dangerous email.

However, most ISPs and some businesses have users with widely varying
expectations regarding junk mail filtering. Some users may want the proxy to block all
potentially dangerous mail, while other users will tolerate significant amounts of spam
but do not want to have a single message inadvertently blocked. For this situation, the
Active Filtering tests (Active Relay 1450, Active Dialup 1420, and Active User 1900)
are used in conjunction with per-recipient whitelists to permit individual users to
receive mail that might otherwise be rejected by the Active Filtering tests and a
quarantine mechanism to save email that is rejected by Active Filtering and does not
match a recipient whitelist.

Thus, the whitelist 1600 is flexible to provide support for a large organization having
a single junk mail policy or for an ISP that allows the individual recipients to define
their own respective filtering policies. The recipient whitelists 1600 are consulted
only if a message would be intercepted by one of these Active Filtering tests.
Otherwise, if the remote host is not an open relay, the remote host is not a dialup, and
the address does not appear to be forged, the proxy 1401 passes the message to the
MTA 1402 without considering user whitelists 1600 and without saving the message
to the quarantine database 1610. If a particular recipient does not have a recipient
whitelist 1600, then the proxy will deliver the message to that recipient only if it
successfully passes all of the Active Filtering tests. If an email message fails an
Active Filtering test, is not whitelisted by all recipients, and message quarantining is
configured in the Configuration Database 1098 (Figure 24), then the email message is
quarantined for the recipients that do not have a whitelist that matches the sender's
address.

In accordance with the preferred embodiment of Fig. 24, each recipient whitelist 1600
is maintained in a separate file, located in the directory identified in the configuration
database 1098, and having a name that identifies the recipient's email address. For
example, if a proxy 1401 is an MX host for the two domains escom.com and foo.net,
the recipient whitelist directory might have the recipient whitelist database files 1600
of Table 5.

alice@escom.com            asmith@foo.net
hari@escom.com             cap@escom.com
postmaster@escom.com       postmaster@foo.net
psmith@foo.net             rjones@foo.net

Table 5 - Sample Whitelist Database 1600

Each of the files of Table 5 contains a sequence of one or more substring
patterns for the recipient identified in the file name. If a user has multiple addresses
(e.g., alice@escom.com and asmith@foo.net), then there is preferably a separate file
for each address.

Substring patterns are used to identify senders (MAIL From addresses) that are
permitted to override Active Filtering decisions and thus send mail to the recipient.
For example, the file alice@escom.com might contain @somedom.dom,
@elsewhere.net and jane@doe.com. Thus, MAIL From: <george@somedom.dom>
via an open relay to alice@escom.com would be permitted because the pattern
"@somedom.dom" is a substring of the MAIL From address. In addition, mail from
jane@doe.com would also be accepted, though mail from john@doe.com would not.
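
The substring test described above can be sketched in a few lines. The function
name is_whitelisted is hypothetical, and two details are assumptions rather than
anything stated in this application: the comparison is made case-insensitive,
and blank patterns are ignored so that an empty line in a whitelist file does
not match every sender.

```python
def is_whitelisted(mail_from, patterns):
    """Return True if any non-empty pattern is a substring of the
    MAIL From address."""
    # Assumption: angle brackets are removed and case is ignored.
    sender = mail_from.strip().strip("<>").lower()
    return any(p.strip().lower() in sender for p in patterns if p.strip())
```

With the patterns of the example file for alice@escom.com, the call
is_whitelisted("<george@somedom.dom>", ["@somedom.dom", "@elsewhere.net",
"jane@doe.com"]) is true, while the same call for john@doe.com is false.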

The proxy 1401 also has a system-wide whitelist file 1094 that pertains to all
recipients. As previously discussed with reference to Figure 15, the MAIL From
address is whitelisted only if it is an exact match of an entry in the whitelist database
1094. However, the system whitelist 1094 and recipient whitelist 1600 entries are
preferably tested as substrings of the MAIL From address to determine if it is a match.

It should be noted that the recipient whitelists 1600 do not override blacklisted
IP addresses. If a remote host is blacklisted, as shown in Figure 14, the proxy closes
the connection from the remote host without proceeding to the exchange of MAIL
From and RCPT To messages.

If a message is rejected by Active Filtering and the proxy is configured for
quarantining, the proxy will save the email message to a separate file in the quarantine
directory. The quarantine directory is specified in the Configuration Database 1098,
for example, /var/spool/asmtp/QD. The proxy assigns a name for each quarantine
message file, such as including the characters "qf", the numeric month, day, hour and
minute, and the process ID for the proxy process. For example, the complete
pathname for a quarantine file might be: /var/spool/asmtp/QD/qf03051459-21451.

Each quarantine file contains the remote host's name and IP address, the MAIL
From address, at least one RCPT To address, a DATA line, and the text of the
message as received from the remote host. For example, the first few lines of a
quarantine file might contain:

connect host: mercury.somewhere.dom/192.168.100.10
MAIL From: <sender@somewhere.dom>
RCPT To: <bob@local.dom>
RCPT To: <carol@local.dom>
DATA
Received: from remotehost.somewhere.dom [192.168.255.255]...

The proxy creates quarantine files when required for a message matching the
quarantine criteria. The quarantine files remain in the quarantine directory until they
are manually removed by an administrator, retrieved by all recipients, or automatically
removed by an operating system command (such as the UNIX cron command) that
finds and deletes files of a certain age. For example, an administrator may periodically
review the quarantined messages in order to delete those messages that contain spam
or objectionable material and forward (and then delete) those messages which are
relevant to the organization's business.
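
The age-based cleanup mentioned above could be performed by a small script run
periodically, for instance from cron. The sketch below is one possible form; the
function name purge_old_quarantine_files and the use of the file modification
time as the measure of age are assumptions for illustration.

```python
import os
import time

def purge_old_quarantine_files(directory, max_age_seconds, now=None):
    """Delete regular files in `directory` whose modification time is more
    than `max_age_seconds` in the past. Returns the sorted names removed."""
    now = time.time() if now is None else now
    with os.scandir(directory) as it:
        entries = list(it)  # snapshot the listing before deleting anything
    removed = []
    for entry in entries:
        if not entry.is_file(follow_symlinks=False):
            continue  # skip subdirectories and symbolic links
        age = now - entry.stat(follow_symlinks=False).st_mtime
        if age > max_age_seconds:
            os.remove(entry.path)
            removed.append(entry.name)
    return sorted(removed)
```

A site with a seven-day quarantine period would call the function with
max_age_seconds set to 7 * 24 * 3600.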

The proxy itself does not provide any capabilities for managing quarantine files,
but supplementary tools permit administrators and users to forward quarantined
messages from storage on the proxy host to the user's mailbox on the MTA. For
example, an administrator can run a utility program (qadmin) on the proxy host to
forward a quarantined message to the MTA, where it will be delivered to the
recipients listed in the quarantine file. A network user can run a quarantine client (qo)
program to list messages for that recipient, and to forward selected messages for that
recipient to the user's mailbox on the MTA.

Not all rejected mail is quarantined. The Configuration Database 1098 contains
settings to separately control quarantining of Active Relay rejections, Active Dialup
rejections, and Active User rejections. Thus, if an organization has a policy of not
permitting any SMTP Direct email, the proxy can be configured to reject Active
Dialup messages without quarantining, while quarantining messages rejected by the
Active Relay and Active User tests.

One disadvantage of quarantining is that it requires disk storage on the proxy
server host. When a message is quarantined, the proxy makes a local copy of the
message on the proxy's file system. To prevent the proxy from running out of disk
space, administrators must provide a large pool of disk storage, must periodically
monitor disk usage, and must provide a means for removing quarantined messages
after some site-defined quarantine period. One possible threat would be for an attacker
to attempt to fill all the available disk storage by repeatedly sending large messages
that would be quarantined by the proxy. This threat is reduced somewhat by automatic
blacklisting of a remote host after it transfers the first message, but the attack could be
modified to relay a large message from many different open relays, for instance. A
related disadvantage of quarantining is that if the quarantined message has malicious
content, the message may be retrieved by a user who will further spread the virus or
malicious content. This risk can be ameliorated somewhat by using virus filtering
tools to scan the quarantine directory for malicious content.

However, these disadvantages are generally considered to be outweighed by the
advantages of having copies of email rejected by the Active Filtering tests. One
advantage of quarantining rejected email is that it provides the administrator with a
ready database of all collected junk mail for potentially pursuing legal action against
high-profile spammers. The administrator does not have to retrieve this evidence from
individual users in order to pursue legal action.

The operation of the preferred embodiment of Figure 24 will now be discussed
with reference to the flow charts of Figures 25-28. The same Active Filtering
methods (Active Relay 1450-1463, Active Dialup 1420-1423, and Active User
1901-1913 tests) are used as described with reference to Figures 14, 18 and 19.
However, enforcement of the Active Relay, Active Dialup, or Active User test results
is deferred until each RCPT message is received. The RCPT message identifies each
intended recipient, which permits the proxy to find the whitelist 1600 (if it exists) to
be used for that recipient. Consequently, to support both per-recipient whitelisting and
message quarantining, this alternate embodiment involves changes to the proxy 1401
for MAIL From (Figure 25), RCPT To (Figures 26-27) and DATA (Figure 28) message
processing.

In the preferred embodiment, the proxy provides mode flags that can be set by
an administrator so that the proxy will perform only the selected Active Filtering tests.
If a particular mode flag (e.g., modedial) is not set, the proxy will not perform that
particular test for any sender and, thus, will not reject any message for that reason. If
all of the mode flags are set, the proxy will preferably perform the tests in the order
shown in Figure 25. In other alternative preferred embodiments, the proxy can
perform the Active Filtering tests in other orders.

In Figure 25, steps 1413-1419, the proxy 1401 attempts to open a reverse test
connection 1415 to the remote host 1400. The proxy 1401 then performs Active
Relay testing (1450-1463) if the connection is successfully opened, or performs
Active Dialup testing (1420-1423) otherwise. Finally, the proxy 1401 performs
Active User testing (1901-1913) if the results of Active Relay or Active Dialup testing
are inconclusive (e.g., protocol failure), or if the remote host is not identified as a
dialup 1602.
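
The ordering of the tests in Figure 25 can be sketched as follows. This is a
reading of the paragraph above under stated assumptions, not the proxy's actual
code: the Verdict type, the function name run_active_filtering, and the
representation of each test as a callable are hypothetical, and the mode flags
are omitted for brevity. In this sketch, Active User testing runs in every case
except when the host has been identified as a dialup, which also covers the
inconclusive case.

```python
from enum import Enum

class Verdict(Enum):
    PASS = "pass"
    FAIL = "fail"
    INCONCLUSIVE = "inconclusive"

def run_active_filtering(open_reverse_connection, relay_test, dialup_test, user_test):
    """Run the tests in the Figure 25 order and return the set of failure
    reasons; a non-empty set corresponds to setting the reject flag."""
    failures = set()
    if open_reverse_connection():
        # Reverse connection opened: test the remote host for open relaying.
        if relay_test() is Verdict.FAIL:
            failures.add("relay")
    elif dialup_test() is Verdict.FAIL:
        # No reverse connection: the host may be a dialup.
        failures.add("dialup")
    if "dialup" not in failures:
        if user_test() is Verdict.FAIL:
            failures.add("forged")
    return failures
```

The result is only recorded at this stage; enforcement is left to RCPT
processing.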

The proxy 1401, however, does not immediately enforce the Active Filtering test
results by rejecting any message that fails the relevant Active Filtering tests. Instead,
the proxy 1401 checks for an open relay 1601, dialup 1602, or forged address 1603,
and then sets a reject flag in step 1604 if it finds any of these conditions. The proxy
also reads quarantine flags (step 1605) from the Configuration Database 1098. There
is a separate quarantine flag for each of the three tests of Active Filtering, thus
permitting Active Relay rejections to be quarantined, for example, while rejecting
outright Active Dialup and Active User failures. Finally, the proxy then sends the
MAIL response message to the remote host in step 1606.

Figures 26 and 27 illustrate the steps involved for the proxy 1401 to process a
single RCPT message from the remote host. The remote host may send multiple (n)
RCPT messages, up to the maximum number of recipients permitted by the proxy.
This is indicated symbolically as n = 1..maxrcpt, where the maximum number of
recipients, maxrcpt, is defined in the configuration database 1098. In step 1630, the
remote host 1400 sends the RCPT message with the email address of the intended
recipient, shown symbolically as <rcpt-n>.

At step 1631, the proxy determines if the message is trusted, either (a) because
the remote host matches an entry in the trusted database or (b) because the
MAIL From address matches an entry in the system whitelist 1094. In either case, the
proxy proceeds to transfer the RCPT message to the MTA, beginning at step 1637.

If the message is not trusted, the proxy determines if it has set the reject flag
1604 to indicate an open relay, dialup, or forged user address. If the flag is not set,
then Active Filtering found no problems, and so the proxy proceeds to step 1637 with
transfer of the RCPT message.

Consequently, as a result of steps 1631 and 1632, any remaining messages are
untrusted and were flagged by Active Filtering as a junk mail risk. These are
messages that would have been rejected by the system in accordance with the Active
Dialup, Relay and User tests, as shown in Figure 7.

At step 1633 the proxy attempts to open the recipient's whitelist file 1600. This
consists of getting the recipient whitelist directory from the Configuration database
1098, appending a "/" character to separate the directory and filename, then appending
the RCPT address. For example, if the recipient whitelist directory is
/var/spool/asmtp/UW (where UW is User Whitelists) and the recipient is
"asmith@foo.net", then the recipient whitelist file is
/var/spool/asmtp/UW/asmith@foo.net. The recipient address is used as the name of
the whitelist file because it is unique for each possible recipient address and because it
simplifies file management. While this scheme does not permit grouping of multiple
addresses with a single profile, it is understood that a separate table lookup
mechanism could be used to map multiple recipient addresses to a single filename.
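
The construction of the whitelist pathname, together with the optional table
lookup, might be sketched as shown below. The function name whitelist_path and
the aliases dictionary are hypothetical. The check that refuses addresses
containing a "/" is an added precaution assumed here, since an address is used
directly as a file name; it is not described in this application.

```python
import posixpath

def whitelist_path(whitelist_dir, rcpt_address, aliases=None):
    """Build the pathname of a recipient's whitelist file.

    `aliases` is an optional, hypothetical table that maps several
    recipient addresses to one shared file name.
    """
    address = rcpt_address.strip().strip("<>")
    # Assumed precaution: never let an address escape the directory.
    if "/" in address or address in ("", ".", ".."):
        raise ValueError("unsafe recipient address: %r" % rcpt_address)
    filename = (aliases or {}).get(address, address)
    return posixpath.join(whitelist_dir, filename)
```

For the example above, whitelist_path("/var/spool/asmtp/UW", "asmith@foo.net")
yields /var/spool/asmtp/UW/asmith@foo.net.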

At step 1634 the proxy checks the return status from the file open. The most
likely reason for a failure is that the recipient does not have a recipient whitelist file
1600. A whitelist file is not required for every recipient, and the whitelist file is
created at the option of the recipient or system administrator. Users without whitelists
receive email that is accepted by the Active Dialup, Relay and User tests of the Active
Filtering proxy. This includes mail from trusted sources and non-relay, non-dialup
mail from legitimate addresses.

If the recipient whitelist file 1600 is not available, the proxy proceeds to step
1650 where it checks the quarantine flags to determine whether to reject the message
for this recipient or to add the recipient to a quarantine file.

If the whitelist open was successful, at step 1635 the proxy reads entries in the
whitelist file to determine if any whitelist entry is a substring of the MAIL From
address. For example, if the MAIL From address was "george@somedom.dom", then
any of the following patterns would match that address: @, george@, george,
@somedom.dom, somedom.dom, and george@somedom.dom.

The "@" pattern is a special case that will match any MAIL From address,
because the proxy requires all MAIL From addresses to have an "@". The
"@somedom.dom" pattern will allow the recipient to receive mail from any user at
somedom.dom, while the "george@somedom.dom" pattern matches only the specific
sender.

If any pattern matches the MAIL From address, then the proxy transfers the
RCPT message to the MTA beginning at step 1637. Otherwise, the proxy proceeds to
step 1650 to determine whether to reject or quarantine the message for that recipient,
logs the RCPT status (not shown), issues an appropriate status response, and waits for
further input from the remote host.

At step 1637, the proxy checks if this recipient is the first authorized recipient
for this message. If so, the proxy connects to the MTA as shown in step 1640, sends
the HELO message received earlier from the remote host, and sends the MAIL From
transaction received earlier in step 1413 (Figure 24). If any error occurs (not shown),
the proxy closes the data connection 1403 from the remote host, logs the status, and
exits.

The proxy sends the current RCPT message to the MTA in step 1646, waits for
the MTA response, and sends the MTA response to the remote host in step 1648.
Providing an actual MTA response to the remote host is important because the proxy
does not know which users actually have mailboxes on the local MTA. The proxy
knows if a recipient has a whitelist, but all users do not necessarily have whitelist
files. Consequently, the proxy may accept a message for a recipient that does not have
a mailbox on the MTA. In this event, the proxy should return the MTA's error
message to the remote host.

Continuing with steps 1650 and 1651 on Figure 27, the proxy checks the
quarantine flags (from step 1605, Figure 25) to determine if the email message is to be
rejected or quarantined for the current recipient. The proxy performs this test only
after determining that the message is not trusted (step 1631), that the reject flag was
set by Active Filtering (step 1632), and that the current recipient does not have a
whitelist entry that matches the sender (steps 1632-1635).

The quarantine flags are stored in the Configuration Database 1098 and retrieved
after determining that the message failed an Active Filtering test in Figure 25.
Individual flags are provided for each of the Active Filtering tests so that individual
types of rejections can be independently enabled or disabled for quarantining. These
flags permit organizations to configure the proxy to quarantine some types of risky
messages (e.g., Active Relay failures) while rejecting outright other types of messages
(e.g., Active Dialup and Active User failures), according to the local organization's
policies. While quarantine flags are defined only for the various Active Filtering
modes, other embodiments might provide additional flags for various conventional
spam-filtering rejections, such as a non-existent MAIL From domain.

A message is quarantined for a recipient if the quarantine flag is set for the
corresponding Active Filtering rejection. For example, if the remote host fails the
Active Relay test and the Active Relay quarantine flag is set, the proxy will perform
quarantine processing beginning at step 1652. Otherwise, if the corresponding flag is
not set, the proxy rejects the message for this recipient by sending a 550 response at
step 1655.

At step 1652 the proxy checks whether the quarantine file was opened for a
previous recipient. If not, the proxy attempts to create a new quarantine file for the
message at step 1653. This involves retrieving the quarantine directory name from the
configuration file 1098 and appending a unique name for the file within that directory.
The unique name is constructed by concatenating the numeric month, date, hour and
minute and the numeric ID of the proxy process along with other fixed characters
suitable for identifying a quarantine file. For example, if the quarantine directory is
"/var/spool/asmtp/QD", the current date and time is March 5, 14:15 and the proxy's
process ID is 15113, the pathname for the quarantine file would be:
/var/spool/asmtp/QD/qf03051415-15113.
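
The naming scheme of step 1653 and the status check that follows it can be
sketched as below. The function names are hypothetical, and the use of an
exclusive-create open (so that an existing file is reported as a failure) is an
assumption about how the "already exists" case might be detected.

```python
import os
from datetime import datetime

def quarantine_filename(when, pid):
    """Build a name of the form qfMMDDhhmm-PID."""
    return "qf%s-%d" % (when.strftime("%m%d%H%M"), pid)

def create_quarantine_file(directory, when=None, pid=None):
    """Create the quarantine file exclusively. Returns an open file object,
    or None if the file could not be created."""
    when = datetime.now() if when is None else when
    pid = os.getpid() if pid is None else pid
    path = os.path.join(directory, quarantine_filename(when, pid))
    try:
        # Mode "x" fails if the file already exists.
        return open(path, "x", encoding="latin-1", newline="")
    except OSError:
        # Covers an existing file, a full disk, or inadequate access rights;
        # the caller would then send a 550 for the current recipient.
        return None
```

For March 5 at 14:15 and process ID 15113, quarantine_filename returns
qf03051415-15113, matching the example pathname above.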

At step 1654 the proxy checks the return status from the file creation request. If
the proxy cannot create the file, for example, because it already exists or because of
exceptions such as insufficient disk space or inadequate access rights, the proxy sends
a 550 rejection to the remote host for the current recipient. If the quarantine file
creation is successful, the proxy appends SMTP control information, including the
remote host's name and IP address, and the MAIL From address at step 1656.

Steps 1653-1656 are preferably performed only for the first recipient to be
quarantined and not for successive recipients. The first recipient to be quarantined
may or may not be the first recipient. For example, if the first recipient has a whitelist
for the current sender, then the message would be accepted for that recipient. If the
second recipient then did not have a matching whitelist for the sender, the quarantine
file would then be opened for the second recipient.

At step 1657 the proxy appends the current recipient address. This occurs for
each recipient that is not whitelisted or accepted for some other reason. For example,
if a message has ten recipients, four of which are whitelisted, then the remaining six
recipients would be appended to the quarantine file.

The proxy then sends a "250" response to the remote host for the current
recipient. In summary, the proxy sends a "250" response at step 1658 if the message is
quarantined for this recipient, or a "550" response at step 1655 if the message is not
quarantined. This might occur if the quarantine flag is not set (step 1651) or if the
quarantine file could not be created (step 1654).

This completes processing of a single RCPT transaction 1630 from the remote
host 1400 to the proxy 1401. The recipient may have been accepted or rejected by the
MTA at step 1648; rejected by the proxy at step 1655; or accepted for quarantining by
the proxy at step 1658.
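
The per-recipient decision of Figures 26 and 27 can be summarized in a short
sketch. The function name process_rcpt, the string outcomes, and the
representation of the configuration as a set of quarantine-enabled failure types
are assumptions for illustration. Failure to create the quarantine file (step
1654), which turns a quarantine into a rejection, is left out.

```python
def process_rcpt(trusted, failure, whitelist_patterns, mail_from, quarantine_flags):
    """Decide the handling of one RCPT address.

    trusted            -- True if the host is trusted or the sender is on
                          the system whitelist (step 1631)
    failure            -- "relay", "dialup" or "forged" if the reject flag
                          is set, otherwise None (step 1632)
    whitelist_patterns -- patterns from the recipient's whitelist file, or
                          None if the file could not be opened (step 1634)
    quarantine_flags   -- set of failure types configured for quarantining

    Returns "forward" (pass the RCPT to the MTA), "quarantine" (250
    response) or "reject" (550 response).
    """
    if trusted or failure is None:
        return "forward"
    if whitelist_patterns is not None:
        # Step 1635: any non-empty pattern that is a substring matches.
        if any(p and p in mail_from for p in whitelist_patterns):
            return "forward"
    # Steps 1650-1651: consult the quarantine flag for this failure type.
    if failure in quarantine_flags:
        return "quarantine"
    return "reject"
```

A "forward" outcome still leaves the final answer to the MTA, whose response is
relayed to the remote host at step 1648.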

After sending the MTA's RCPT response to the remote host, the proxy waits for
further input in accordance with the SMTP protocol. This may include additional
RCPT transactions (up to maxrcpt), a DATA message, or a QUIT message. The
maxrcpt value is retrieved from the Configuration Database 1098 as part of overall
initialization of the proxy. If the remote host attempts to send more than maxrcpt
RCPT messages, the proxy rejects the additional messages with a suitable error status
(not shown). In accordance with the SMTP protocol, a sending MTA will send a
DATA message when it has successfully transferred at least one RCPT, or a QUIT
message if no RCPTs were accepted by the receiving MTA.

Figure 28 shows the steps involved in processing the DATA transaction and
subsequent text of the message (including the message header, body, and attachments,
if present). The addition of per-recipient whitelisting and quarantining in this
alternative preferred embodiment involves additional considerations in addition to
those shown in Figure 21.

First of all, because of per-recipient whitelisting, a message can be directly
transferred to one set of recipients and simultaneously rejected or quarantined for the
remaining recipients. In the original preferred embodiment, a message that failed
Active Filtering was rejected for all recipients. However, because of per-recipient
whitelisting, the proxy must provide for splitting the message text from the remote
host into two identical parts, with one part being sent to the MTA and the other sent to
the quarantine file. The second complication is that there are additional data paths
necessary to handle messages that are rejected outright (that is, with no collection of
data) and messages that are saved to a quarantine file.

At step 1661 the proxy determines if the message was accepted for at least one
recipient. All mail from trusted hosts, hosts that successfully pass Active Filtering,
and mail authorized by recipient whitelists follows this flow of processing.

If the email message is authorized for at least one recipient, then the proxy
transfers data between the remote host and the MTA as shown in Figure 21 (steps
1485-1498). That is, the proxy receives the DATA command, message header,
message text, and the "." and QUIT commands from the remote host and transfers
them transparently (except for Bcc filtering) to the MTA. Where responses are
required by the SMTP protocol, the proxy accepts the response from the MTA and
forwards it back to the remote host.

In addition, the proxy also checks for the existence of an open quarantine file
each time it reads a line of text from the remote host. If the quarantine file is open
(step 1675), the proxy duplicates the data stream and writes the line to the quarantine
file in step 1676. Otherwise, if the quarantine file is not open, then processing is the
same as shown in Figure 21.
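
The duplication of the data stream in steps 1675-1676 amounts to writing each
line to two destinations. A minimal sketch follows; the function name
relay_message_text is hypothetical, the destinations are any objects with a
write method, and the choice to pass the end-of-message line to the MTA without
storing it in the quarantine file is an assumption.

```python
def relay_message_text(lines, mta_out, quarantine_out=None):
    """Copy message lines to the MTA and, when a quarantine file is open,
    duplicate each line into it. Stops after the SMTP end-of-message line
    (a period on a line by itself). Returns the number of lines read."""
    count = 0
    for line in lines:
        end_of_message = line.rstrip("\r\n") == "."
        mta_out.write(line)
        # Assumption: the terminator itself is not stored in the file.
        if quarantine_out is not None and not end_of_message:
            quarantine_out.write(line)
        count += 1
        if end_of_message:
            break
    return count
```

When quarantine_out is None the function simply forwards the lines, which
corresponds to the processing of Figure 21.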




With respect to step 1661, if the current message is not authorized for any
recipients, then the proxy has not opened a connection to the MTA. This is the flow
that is followed for most junk mail, where Active Filtering rejects the message and the
message is not authorized by any recipient whitelists.

If the remote host sends a QUIT or a DATA message with no valid recipients,
this indicates that the message was identified as a junk mail risk by the Active
Filtering tests but that no recipients had a whitelist that matched the MAIL From
address.

At step 1662, the proxy may append the IP address of the remote host to the blacklist
database 1095. The proxy checks for the existence of the autoblacklist flag in the
configuration database 1098. If the flag exists, the proxy appends the IP address to
the blacklist; otherwise the IP address is not added to the database.

In step 1663 the proxy checks for existence of an open quarantine file. If the
quarantine file exists, the proxy appends the DATA line (1665) and sends a response
(1666) in accordance with the SMTP protocol. Otherwise, if there is no quarantine
file, then the proxy sends a 550 error response to the remote host, logs the event, and
closes the data connection 1403, as shown in step 1664.

If a quarantine file exists, the proxy waits for the remote host to send the
message header and text, one line at a time, as shown in Figure 21, steps 1488-1495.
At step 1667 the proxy reads a line from the remote host. At step 1668 the proxy
compares each line of text with the SMTP end-of-message indicator (a period on a
line by itself). If the received line of text is not an end-of-message, the proxy appends
the line of text to the quarantine file. Optionally, not shown, the proxy may perform a
maximum length check at this point to help deter attempts by attackers to use up all
the proxy's disk space. Otherwise, if the line of text indicates the end of message, the
proxy closes the quarantine file, handles the closing SMTP protocol with
the remote host, closes the data connection, and exits.
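
The loop of steps 1667 and 1668, together with the optional maximum length check,
might look like the following sketch. The function name quarantine_message, the
max_bytes parameter, and its default value are assumptions made for illustration;
SMTP dot-stuffing and the closing protocol exchange are deliberately left out.

```python
import io


def quarantine_message(lines, quarantine_file, max_bytes=1_000_000):
    """Append message lines to quarantine_file until the end-of-message line.

    Returns True if the end-of-message indicator (a period on a line by
    itself) was seen, and False if the input ended first or the hypothetical
    size limit max_bytes would be exceeded.
    """
    written = 0
    for line in lines:                      # step 1667: read a line
        if line.rstrip("\r\n") == ".":      # step 1668: end-of-message?
            return True
        written += len(line)
        if written > max_bytes:             # optional maximum length check
            return False
        quarantine_file.write(line)
    return False


# Example: everything before the lone period is stored.
store = io.StringIO()
done = quarantine_message(["Subject: x\r\n", "\r\n", "hi\r\n", ".\r\n"], store)
```

The caller remains responsible for closing the quarantine file and for sending
the final SMTP responses, as the text describes.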

If the proxy finds the Bcc filtering flag in the configuration database 1098, it
performs Bcc testing as shown in Figure 21, steps 1488-1492. As the proxy scans each line
of the message header, it looks for a local domain name in To:, Cc:, or continuation
lines. If the message is not whitelisted or otherwise trusted, and a local domain was
not found, the proxy rejects the message and closes the connection.
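
A minimal sketch of the header scan used for Bcc testing follows. The function
name names_local_domain is hypothetical, and the substring comparison is a
simplification: a production check would parse the addresses rather than search
the raw header text.

```python
def names_local_domain(header_lines, local_domain):
    """Return True if a To: or Cc: header, or one of its continuation lines,
    mentions an address at local_domain."""
    needle = "@" + local_domain.lower()
    in_recipient_header = False
    for line in header_lines:
        if line.rstrip("\r\n") == "":
            break                            # blank line ends the header
        if line[0] not in " \t":
            # A new header field starts; continuation lines keep the old state.
            field = line.split(":", 1)[0].strip().lower()
            in_recipient_header = field in ("to", "cc")
        if in_recipient_header and needle in line.lower():
            return True
    return False


# Example: the local address appears on a continuation line of To:.
header = ["From: a@remote.dom\r\n", "To: x@other.dom,\r\n",
          "\tbob@example.com\r\n", "Subject: hi\r\n", "\r\n"]
visible = names_local_domain(header, "example.com")
```

A message for which this check fails, and which is neither whitelisted nor
otherwise trusted, would be rejected as described above.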

One of the advantages of quarantining rejected messages is that the message is
available for review by administrators or by intended recipients, and can be forwarded
to the MTA if it is a legitimate message. Various methods can be used to access the
quarantine database, including the methods shown in Figure 29 for forwarding desired
messages to the MTA.

As shown in Figure 29, the proxy server includes the Active Filtering Proxy
program 1104 that handles incoming SMTP connections from remote hosts on the
Internet 1100. If a message meets the quarantine criteria described earlier, the proxy
stores the message in the Quarantine Database 1610 and optionally may append the IP
address of the remote host to the Blacklist Database 1095.

The system shown in Figure 29 provides two methods for retrieving quarantined
messages. An administrator can run the Quarantine Administrator (qadmin) utility
1681 on the proxy server host to forward a quarantined message to all recipients listed
in the quarantine file. Users can run a Quarantine Client (qc) program 1684 on a
workstation 1683 or MTA 1402 to list and retrieve their quarantined messages.

The qadmin program 1681 opens a quarantine file (specified as a calling
argument), gets the IP address and port number of the MTA from the Configuration
Database 1098, and establishes a TCP connection 1682 to the MTA. It sends the
MAIL From, RCPT To, and DATA commands from the quarantine file to the MTA,
then reads the rest of the message from the quarantine file and transfers it line-by-line
to the MTA. After successfully transmitting the message, the qadmin program
removes any entries in the blacklist database 1095 that match the remote host's IP
address, appends the sender's MAIL From address to each recipient's user whitelist
1600, and removes the quarantine file from the quarantine database 1610.

A user can run the Quarantine Client (qc) program 1684 on any workstation or
server. This program permits the user to list all messages in the quarantine database
1610 that were addressed to the user and to forward selected messages from that list to
the MTA 1402. The qc client creates the user address by getting the user's login name
from the operating system and appending the local domain name. The Quarantine
Protocol 1685 preferably provides cryptographic information (e.g., credentials) in
addition to the user's email address or a log in by the user to the qs server in order to
prevent spoofing of other users' email addresses.

The qc program opens a TCP connection 1685 to the proxy server host 1401 and
interacts with a Quarantine Server (qs) program 1686 to access the user's quarantine
files. The user can request qc to provide a listing of all quarantined messages
addressed to that user. Each line of the report represents a single quarantine message:

Name                From                    Subject

qf03051459-21481    sender@somewhere.dom    Re: computer bid
qf03051522-21552    dude@elsewhere.jp       MAKE MONEY FAST??!!

The report shows the file name, which indicates the date and time of the message
(e.g., 03051459 is March 5 at 2:59pm), the MAIL From address, and the Subject line
from the message. The qs server gets the quarantine directory from the Configuration
Database 1098, reads the directory entries, and opens each file in the directory
searching for a RCPT line listing the exact user address sent by the qc client. For each
match, it returns the file name, sender, and Subject to the client, which formats the
output and caches the information for subsequent listings.
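
The listing step performed by the qs server can be sketched as follows. Several
details here are assumptions rather than part of the described system: the
quarantine files are modeled as a dictionary of file name to text, each file is
assumed to begin with the MAIL From and RCPT To lines followed by the message
header, and the meaning of the digits after the hyphen in the file name is not
interpreted.

```python
import re


def parse_name(name):
    """Split a name such as qf03051459-21481 into (month, day, hour, minute)."""
    match = re.fullmatch(r"qf(\d{2})(\d{2})(\d{2})(\d{2})-(\d+)", name)
    if match is None:
        return None
    return tuple(int(part) for part in match.groups()[:4])


def list_for_user(files, user_address):
    """Return (file name, sender, subject) for each file naming user_address
    on a RCPT line. files maps file names to file contents (hypothetical)."""
    report = []
    for name in sorted(files):
        sender, subject, is_recipient = "", "", False
        for line in files[name].splitlines():
            if line == "":
                break                        # end of header: stop scanning
            upper = line.upper()
            if upper.startswith("MAIL FROM:"):
                sender = line[len("MAIL FROM:"):].strip().strip("<>")
            elif upper.startswith("RCPT TO:"):
                rcpt = line[len("RCPT TO:"):].strip().strip("<>")
                is_recipient = is_recipient or rcpt == user_address
            elif upper.startswith("SUBJECT:") and not subject:
                subject = line[len("SUBJECT:"):].strip()
        if is_recipient:
            report.append((name, sender, subject))
    return report
```

The exact string comparison on the RCPT address mirrors the requirement that the
server search for the exact user address sent by the client.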

Based upon the information in the listing, the user may choose to forward a
particular message from the quarantine database 1610 to the MTA 1402. For
example, if the user requests the message qf03051459-21481, the qc client 1684 sends
the file name to qs 1686, which verifies that the client's email address is listed as a
recipient of the message, opens the file, establishes a connection 1682 with the MTA,
and transfers the message to the MTA. The MTA then queues the message to the
user's mailbox.

After successfully transmitting the message, the qs program removes any entries
in the blacklist database 1095 that match the remote host's IP address and appends the
sender's MAIL From address to the current user's whitelist 1600. If the current user is
the final recipient of the message, the qs program removes the quarantine file from the
quarantine database 1610. Otherwise, it removes the RCPT entry for the current user
and writes the quarantine file back to storage.
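
The bookkeeping that follows a successful retrieval can be summarized in a short
sketch. The data structures are assumptions chosen for brevity: the blacklist is
a set of IP addresses, the whitelists are a dictionary keyed by user address, and
the quarantined message is a dictionary with the hypothetical keys remote_ip,
sender, and recipients.

```python
def after_retrieval(message, user, blacklist, whitelists):
    """Apply the updates made after a quarantined message has been forwarded.

    Returns True when the quarantine file can be removed (no recipients
    remain) and False when it must be written back without this user.
    """
    if user not in message["recipients"]:
        raise PermissionError("user is not a recipient of this message")
    # Remove any blacklist entry matching the remote host's IP address.
    blacklist.discard(message["remote_ip"])
    # Append the sender's MAIL From address to the current user's whitelist.
    whitelists.setdefault(user, set()).add(message["sender"])
    # Drop the RCPT entry for the current user.
    message["recipients"].remove(user)
    return not message["recipients"]


# Example: two recipients retrieve the same message in turn.
msg = {"remote_ip": "192.0.2.7", "sender": "s@x.dom",
       "recipients": ["a@l.dom", "b@l.dom"]}
blocked, allowed = {"192.0.2.7"}, {}
first = after_retrieval(msg, "a@l.dom", blocked, allowed)
second = after_retrieval(msg, "b@l.dom", blocked, allowed)
```

In this example the first call leaves the file in place for the remaining
recipient, and the second call reports that the file can be removed.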

The system (both the retrieval program and proxy) provides adaptive filter
management. Retrieval of a quarantined message automatically configures the
blacklist and recipient whitelist to permit subsequent messages from the sender of the
quarantined message. In the preferred embodiment, retrieval of a quarantined
message causes qadmin or qs to check the blacklist database 1095 and automatically
remove any filters matching the IP address. The rationale for this is that if the
message is legitimate (non-spam), then the remote host that sent the message should
not be blacklisted. This design distributes responsibility to the users of the network
for reviewing and forwarding quarantined messages and, thus, removing blacklist
entries for legitimate hosts.

In addition, retrieval of a quarantined message preferably automatically results
in the sender's MAIL From address being appended to the recipient's user whitelist
1600. For example, a remote open relay host can send a sequence of messages to
local users. As a message arrives, the proxy blacklists the remote host because it is an
open relay. Subsequently, if a user retrieves the message, the retrieval program
(qadmin or qs) removes the blacklist entry 1095 for the remote host and appends the
sender's MAIL From address to the user's whitelist 1600. Addition of the whitelist
entry matching the sender's address automatically prevents further blacklisting, even
though the remote host still fails Active Dialup (1420-1423), Active User
(1901-1913), or Active Relay (1450-1465) tests. Thus, the entire process
of detection, blacklisting, quarantining, blacklist removal and acceptance of the sender
adapts to the reactions of local users and is performed without administrative
involvement.

Other alternative embodiments include web-based or email autoresponder
approaches for retrieval of quarantined messages. For example, with a web-based
approach, the user could run a web browser that accesses an HTTP server and a
database application on the proxy to retrieve listings from the quarantine database.
With this approach, the user can list and forward quarantined messages by clicking a
search button, or can read messages via the browser interface. In this case, the
retrieval server should preferably (a) require the user to log in to the server process
and (b) restrict access by a user to only the messages where the user is listed as a
recipient of the message. Otherwise, an inquisitive user would be able to browse
through all quarantined messages, even those addressed to other users, by simply
guessing at file names.

This preferred embodiment describes retrieval of messages quarantined by an
Active Filtering proxy server. However, the automatic removal of blacklist entries,
and the automatic addition of a user whitelist entry following retrieval of a
quarantined message is not limited to Active Filtering embodiments. The automatic
removal of blacklist entries during retrieval of a quarantined message can be used in
any filtering embodiment having a blacklist, quarantine storage, and a quarantine
retrieval mechanism. Similarly, the automatic addition of a user whitelist entry during
retrieval of a quarantined message can be used in any filtering embodiment having
per-recipient whitelists, quarantine storage and a quarantine retrieval mechanism.

Though the preferred embodiment uses separate whitelist files, other
embodiments can provide the same general capability, for example, use of a single
database containing authorized (sender, recipient) pairs, use of wildcards, or use of
accept/deny authorizations as are typically used in access control lists.
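
As one illustration of such an alternative, the sketch below combines a single
rule table of (sender, recipient) pairs with wildcards and accept/deny actions.
The rule format, the first-match-wins ordering, and the function name
is_authorized are all assumptions; the wildcard matching relies on the shell-style
patterns of Python's standard fnmatch module.

```python
from fnmatch import fnmatchcase


def is_authorized(rules, sender, recipient):
    """Evaluate an ordered list of (action, sender_pattern, recipient_pattern).

    The first rule whose patterns match both addresses decides the outcome;
    a pair that matches no rule is not authorized.
    """
    sender, recipient = sender.lower(), recipient.lower()
    for action, sender_pattern, recipient_pattern in rules:
        if fnmatchcase(sender, sender_pattern) and fnmatchcase(recipient, recipient_pattern):
            return action == "accept"
    return False


# Example: deny one address outright, accept the rest of a domain for one user.
rules = [
    ("deny", "spam@partner.dom", "*"),
    ("accept", "*@partner.dom", "bob@example.com"),
]
allowed = is_authorized(rules, "alice@partner.dom", "bob@example.com")
```

Placing deny rules ahead of broader accept rules follows the usual access
control list convention, where the order of entries determines precedence.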

Retrieval of a quarantined message automatically updates the user's respective
whitelist 1600. In addition, a user can also edit his or her whitelist by making add,
delete, or list requests to the Quarantine Client (qc) program 1684. The qc program
sends the edit requests to the Quarantine Server program 1686, which edits the
whitelist file for the user.

Though the whitelist database 1600 and quarantining have been described for
use with the combination of Active Dialup, Active Relay and Active User tests, the
whitelisting and quarantining can be used with any single test, or independently. In
addition, the whitelisting and quarantining can be used together or separately.

Other mechanisms can also be incorporated into the proxy filter, such as a check
for domain existence and a content match. The domain existence function checks for
the existence of the MAIL From domain. The content match checks for keywords in
the MAIL From address. If the message contains any word in the keyword list, the
message is rejected.

Though the invention has been described for current spamming techniques, such
as specific relaying, dialup and user methods, the invention should not be construed as
limited to these current approaches. The invention can be implemented to address any
developed relaying, dialup and user (forgery) attacks, and can include any suitable
per-recipient whitelisting and/or quarantining. Therefore, it is not desired to limit the
invention to the specific examples disclosed or the exact construction and operation
shown and described. Rather, all suitable modifications and equivalents may be
resorted to, falling within the scope of the invention.

1. A system for selectively accepting an electronic message sent from a sender
through a remote host (1400) over a connection to a recipient at a server, the system
comprising a dialup filter (1420) determining whether the connection is a dialup
connection and accepting the electronic message if the connection is determined to not
be a dialup connection.

2. The system of claim 1, wherein the connection is determined to be a dialup
connection if the remote host is a dialup.

3. The system of claim 1, further comprising a recipient whitelist database (1600)
including a list of acceptable sender addresses for the recipient, wherein said system
accepts the electronic message for that recipient if the sender address is in said
recipient whitelist database (1600).

4. The system of claim 3, wherein if the sender address is not in said recipient
whitelist database (1600), then the dialup filter determines if the connection is a
dialup connection.

5. The system of claim 3, wherein if the dialup filter determines that the connection
is a dialup connection, then said system accepts the electronic message for that
recipient if the sender address is in said recipient whitelist database (1600).

6. The system of claim 3, wherein said system rejects the electronic message for that
recipient if said dialup filter determines that the connection is a dialup connection, the
sender address is not in said recipient whitelist database (1600) and the dialup filter is
not flagged for quarantining.

7. The system of claim 3, further comprising a blacklist database (1095) having a list
of blacklisted network addresses, wherein the remote host is added to the blacklist
database if the connection is determined to be a dialup connection and the sender is
not matched in any recipient whitelist database (1600).

8. The system of claim 3, further comprising a quarantine database (1610), wherein
if said dialup filter determines that the connection is a dialup connection, the sender
address is not in said recipient whitelist database (1600) and the dialup filter is
flagged for quarantining, then the electronic message is quarantined for that recipient
in said quarantine database.

9. The system of claim 8, further comprising a blacklist database (1095) having a list
of blacklisted network addresses, wherein the remote host is added to the blacklist
database if the connection is determined to be a dialup connection and the sender is
not matched in any recipient whitelist database (1600), and wherein when the
quarantined electronic message is retrieved from the quarantine database, the
blacklisted network address for that remote host is removed from said blacklist
database.

10. The system of claim 8, wherein when the quarantined electronic message is
retrieved from the quarantine database, the sender's address is added to the recipient's
whitelist database (1600).

11. The system of claim 1, said dialup filter determining whether the connection is a
dialup connection based upon a remote host name and a name for at least one host
neighboring the remote host.

12. The system of claim 1, wherein said system attempts to establish a reverse
connection from said system to the remote host, wherein if the reverse connection
cannot be established then said dialup filter determines whether the connection is a
dialup connection.

13. The system of claim 12, further comprising a relay filter, wherein if the reverse
connection is established then said relay filter determines whether the remote host is
an open relay.

14. A system for selectively accepting an electronic message sent from a sender
through a remote host (1400) to a recipient at a server, the system comprising a relay
filter (1450) determining whether the remote host is an open relay and accepting the
electronic message if the remote host is not an open relay.

15. The system of claim 14, further comprising a recipient whitelist database (1600)
including a list of acceptable sender addresses for the recipient, wherein said system
accepts the electronic message for that recipient if the sender address is in said
recipient whitelist database (1600).

16. The system of claim 15, wherein said system rejects the electronic message for
that recipient if said relay filter determines that the remote host (1400) is an open
relay, the sender address is not in said recipient whitelist database (1600) and the relay
filter is not flagged for quarantining.

17. The system of claim 15, further comprising a quarantine database (1610),
wherein if said relay filter determines that the remote host is an open relay, the sender
address is not in said recipient whitelist database (1600) and the relay filter is flagged
for quarantining, then the electronic message is quarantined for that recipient in said
quarantine database.

18. The system of claim 17, further comprising a blacklist database (1095) having a
list of blacklisted network addresses, wherein the remote host is added to the blacklist
database if the remote host is determined to be an open relay and the sender is not
matched in any recipient whitelist database (1600), and wherein when the quarantined
electronic message is retrieved from the quarantine database, the blacklisted network
address for that remote host is removed from said blacklist database.

19. The system of claim 17, wherein when the quarantined electronic message is
retrieved from the quarantine database, the sender's address is added to the recipient's
whitelist database (1600).

20. The system of claim 15, further comprising a blacklist database (1095) having a
list of blacklisted network addresses, wherein the remote host is added to the blacklist
database if the remote host is determined to be an open relay and the sender is not
matched in any recipient whitelist database (1600).

21. The system of claim 14, said relay filter establishing a reverse connection from
said system to the remote host and initiating a test transaction to the remote host from
an unrelated domain, the test transaction being addressed to a test address at a domain
that is unrelated to the remote host, and said relay filter determining that the remote
host is an open relay if the remote host accepts the test transaction.

22. The system of claim 14, wherein a test electronic message is addressed to the
sender and said relay filter determines that the remote host may be an open relay if the
remote host rejects the test electronic message.

23. A system for selectively accepting an electronic message sent from a sender at a
sender's domain through a remote host (1400) to a recipient at a server, the system
comprising a user filter (1900, 1901) verifying whether the sender of the electronic
message is authorized by the sender's domain and accepting the electronic message if
the sender of the electronic message is verified as being authorized.

24. The system of claim 23, further comprising a recipient whitelist database (1600)
including a list of acceptable sender addresses for the recipient, wherein said system
accepts the electronic message for that recipient if the sender address is in said
recipient whitelist database (1600).

25. The system of claim 24, wherein said system rejects the electronic message for
that recipient if said user filter determines that the sender is not authorized, the sender
address is not in said recipient whitelist database (1600) and the dialup filter is not
flagged for quarantining.

26. The system of claim 24, further comprising a quarantine database (1610),
wherein if said user filter determines that the user is not authorized, the sender address
is not in said recipient whitelist database (1600) and the user filter is flagged for
quarantining, then the electronic message is quarantined for that recipient in said
quarantine database.

27. The system of claim 26, further comprising a blacklist database (1095) having a
list of blacklisted network addresses, wherein the remote host is added to the blacklist
database if the sender is determined not to be authorized and the sender is not matched
in any recipient whitelist database (1600), and wherein when the quarantined
electronic message is retrieved from the quarantine database, the blacklisted network
address for that remote host is removed from said blacklist database.

28. The system of claim 26, wherein when the quarantined electronic message is
retrieved from the quarantine database, the sender's address is added to the recipient's
whitelist database (1600).

29. The system of claim 24, further comprising a blacklist database (1095) having a
list of blacklisted network addresses, wherein the remote host is added to the blacklist
database if the sender is determined not to be authorized and the sender is not matched
in any recipient whitelist database (1600).

30. The system of claim 23, wherein the user filter establishes a test connection to a
mail host that is configured for a sender's domain, and initiates a test transaction to the
sender's address, said user filter determining that the user's address is not authorized
if the configured mail host does not accept the test transaction.

31. A system for selectively accepting an electronic message sent from a sender
through a remote host (1400) to a recipient at a server, the system comprising at least
one filter (1420, 1450, 1900, 1901, 1491) that determines whether the electronic
message is undesirable, and a quarantine database (1610) for quarantining the
electronic message for that recipient if said at least one filter determines that the
electronic message is undesirable.

32. The system of claim 31, wherein said quarantine database only quarantines the
undesirable electronic message for the recipient if said at least one filter is flagged for
quarantining, otherwise the electronic message determined to be undesirable is
rejected for the recipient.

33. The system of claim 32, further comprising a recipient whitelist database (1600)
including a list of acceptable sender addresses for the recipient, wherein when the
quarantined electronic message is retrieved from the quarantine database the sender's
address is added to the recipient's whitelist database (1600).

34. The system of claim 33, further comprising a blacklist database (1095) including
a list of blacklisted remote hosts, wherein if the electronic message is determined to
be undesirable, and the sender address is not in any recipient whitelist database
(1600), then the remote host is added to said blacklist database.

35. The system of claim 32, further comprising a blacklist database (1095) including
a list of blacklisted remote hosts, wherein if the quarantined electronic message is
retrieved from said quarantine database, then the blacklisted remote host is removed
from said blacklist database.

36. The system of claim 31, further comprising a recipient whitelist database (1600)
including a list of acceptable sender addresses for the recipient, wherein the electronic
message determined to be undesirable is accepted by the system if the sender address
for that electronic message is matched in any recipient whitelist database (1600).

37. A system for selectively accepting an electronic message sent from a sender
through a remote host (1400) to a recipient at a server, the system comprising at least
one filter (1420, 1450, 1900, 1901, 1491) that determines whether an electronic
message is undesirable, and a recipient whitelist database (1600) including a list of
acceptable sender addresses, wherein the system accepts the electronic message for
that recipient if the sender address is in said recipient whitelist database (1600).

38. The system of claim 37, further comprising a quarantine database (1610),
wherein if said at least one filter determines that the electronic message is undesirable
and the sender address is not in said recipient whitelist database (1600), then the
electronic message is quarantined for that recipient in said quarantine database.

39. The system of claim 38, wherein if the quarantined electronic message is
retrieved from said quarantine database, then the sender address is added to said
recipient whitelist database (1600).

40. The system of claim 38, further comprising a blacklist database (1095) including
a list of blacklisted remote hosts, wherein if the quarantined electronic message is
retrieved from said quarantine database, then the blacklisted remote host is removed
from said blacklist database.

41. The system of claim 37, further comprising a blacklist database (1095) having a
list of blacklisted network addresses, wherein the remote host is added to the blacklist
database if any of the at least one filter determines that the electronic message is
undesirable and the sender address is not matched in any recipient whitelist database
(1600).

42. The system of claim 37, wherein if the sender address is not in said recipient
whitelist database (1600), then the at least one filter determines if the electronic
message is undesirable.

43. The system of claim 37, wherein if the at least one filter determines that the
message is undesirable, then said system accepts the electronic message for that
recipient if the sender address is in said recipient whitelist database (1600).